TO LOUISE




PREFACE 

Now that this book is done I can look back, and identify two purposes which 
led me to begin writing, and which have guided the work to completion. 

One good practical purpose was to bring together and assess the wealth of data 
provided over the last decade by new techniques in experimental physics and in 
optical, radio, radar, X-ray, and infrared astronomy. Of course, new data will 
keep coming in even as the book is being printed, and I cannot hope that this work 
will remain up to date forever. I do hope, however, that by giving a comprehensive 
picture of the experimental tests of general relativity and observational cosmology, 
I will help to prepare the reader (and myself) to understand the new data as they 
emerge. I have also tried to look a little way into the future, and to discuss what 
may be the next generation of experiments, especially those based on artificial 
satellites of the earth and sun. 

There was another, more personal reason for my writing this book. In learning 
general relativity, and then in teaching it to classes at Berkeley and M.I.T., I
became dissatisfied with what seemed to be the usual approach to the subject.
I found that in most textbooks geometric ideas were given a starring role, so that a 
student who asked why the gravitational field is represented by a metric tensor, or 
why freely falling particles move on geodesics, or why the field equations are 
generally covariant would come away with an impression that this had something 
to do with the fact that space-time is a Riemannian manifold. 

Of course, this was Einstein’s point of view, and his preeminent genius 
necessarily shapes our understanding of the theory he created. However, I believe 
that the geometrical approach has driven a wedge between general relativity 
and the theory of elementary particles. As long as it could be hoped, as Einstein 
did hope, that matter would eventually be understood in geometrical terms, it 
made sense to give Riemannian geometry a primary role in describing the theory 
of gravitation. But now the passage of time has taught us not to expect that the 
strong, weak, and electromagnetic interactions can be understood in geometrical
terms, and too great an emphasis on geometry can only obscure the deep con- 
nections between gravitation and the rest of physics. 

In place of Riemannian geometry, I have based the discussion of general
relativity on a principle derived from experiment: the Principle of the Equivalence
of Gravitation and Inertia. It will be seen that geometric objects, such as the 
metric, the affine connection, and the curvature tensor, naturally find their way 
into a theory of gravitation based on the Principle of Equivalence and, of course, 
one winds up in the end with Einstein’s general theory of relativity. However, I 
have tried here to put off the introduction of geometric concepts until they are 
needed, so that Riemannian geometry appears only as a mathematical tool for 
the exploitation of the Principle of Equivalence, and not as a fundamental basis 
for the theory of gravitation. 

This approach naturally leads us to ask why gravitation should obey the 
Principle of Equivalence. In my opinion the answer is not to be found in the realm 
of classical physics, and certainly not in Riemannian geometry, but in the con- 
straints imposed by the quantum theory of gravitation. It seems to be impossible 
to construct any Lorentz-invariant quantum theory of particles of mass zero and 
spin two, unless the corresponding classical field theory obeys the Principle of 
Equivalence. Thus the Principle of Equivalence appears as the best bridge 
between the theories of gravitation and of elementary particles. The quantum 
basis for the Principle of Equivalence is briefly touched upon here in a section on 
the quantum theory of gravitation, but it was not possible to go far into the 
quantum theory in this book. 

The nongeometrical approach taken in this book has, to some extent, affected 
the choice of the topics to be covered. In particular, I have not discussed in detail 
the derivation and classification of complicated exact solutions of the Einstein 
field equations, because I did not feel that most of this material was needed for a 
fundamental understanding of the theory of gravitation, and hardly any of it 
seemed to be relevant to experiments that might be carried out in the foreseeable 
future. By this omission, I have left out much of the work done by professional 
general relativists over the past decade, but I have tried to provide an entree
to this work through references and bibliographies. I regret the omission here of a
detailed discussion of the beautiful theorems of Penrose and Hawking on
gravitational collapse; these theorems are briefly discussed in Sections 11.9 and 15.11,
but an adequate discussion would have taken up too much time and space. 

I have tried to give a comprehensive set of references to the experimental 
literature on general relativity and cosmology. I have also given references to 
detailed theoretical calculations whenever I have quoted their results. However, I 
have not tried to give complete references to all the theoretical material discussed 
in the book. Much of this material is now classical, and to search out the original 
references would be an exercise in the history of science for which I did not feel 
equipped. The mere absence of literature citations should not be interpreted as a 
claim that the work presented is original, but some of it is. 

It is a pleasure to acknowledge the inestimable help I have received in writing 
this book. Students in my classes over the past seven years have, by their questions 
and comments, helped to free the calculations of errors and obscurities. I especially 
thank Jill Punsky for carefully checking many of the derivations. I have drawn
very heavily on the knowledge of many colleagues, including Stanley Deser, 
Robert Dicke, George Field, Icko Iben, Jr., Arthur Miller, Philip Morrison, Martin 
Rees, Leonard Schiff, Maarten Schmidt, Joseph Weber, Rainer Weiss, and
especially Irwin Shapiro. Finally, I am greatly indebted to Connie Friedman and Lillian
Horton for typing and retyping the manuscript with inexhaustible skill and patience. 

Steven Weinberg 

Cambridge, Massachusetts 
April 1971 


NOTATION 

Latin indices $i, j, k, l$, and so on generally run over three spatial coordinate labels,
usually 1, 2, 3 or x, y, z.

Greek indices $\alpha, \beta, \gamma, \delta$, and so on generally run over the four space-time inertial
coordinate labels 1, 2, 3, 0 or x, y, z, t.

Greek indices $\mu, \nu, \kappa, \lambda$, and so on generally run over the four coordinate labels in a
general coordinate system.

Repeated indices are summed unless otherwise indicated.

The metric in an inertial coordinate system has diagonal elements +1, +1, +1, -1.

A dot over any quantity denotes the time derivative of that quantity. 

Cartesian three-vectors are indicated by boldface type. 

The speed of light is taken to be unity, except when c.g.s. units are indicated. 
Planck’s constant is not taken to be unity. 

PART ONE
PRELIMINARIES 




“But the tale of history 
forms a very strong bulwark 
against the stream of time, 
and to some extent checks 
its irresistible flow, and, of 
all things done in it, as 
many as history has taken 
over, it secures and binds 
together, and does not allow 
them to slip away into the 
abyss of oblivion.” 

Anna Comnena, The Alexiad 

I HISTORICAL 
INTRODUCTION 


Physics is not a finished logical system. Rather, at any moment it spans a 
great confusion of ideas, some that survive like folk epics from the heroic periods 
of the past, and others that arise like utopian novels from our dim premonitions 
of a future grand synthesis. The author of a book on physics can impose order on 
this confusion by organizing his material in either of two ways: by recapitulating 
its history, or by following his own best guess as to the ultimate logical structure 
of physical law. Both methods are valuable; the great thing is not to confuse 
physics with history, or history with physics.

This book sets out the theory of gravitation according to what I think is its 
inner logic as a branch of physics, and not according to its historical development. 
It is certainly a historical fact that when Albert Einstein was working out general 
relativity, there was at hand a preexisting mathematical formalism, that of 
Riemannian geometry, that he could and did take over whole. However, this 
historical fact does not mean that the essence of general relativity necessarily 
consists in the application of Riemannian geometry to physical space and time. 
In my view, it is much more useful to regard general relativity above all as a theory 
of gravitation, whose connection with geometry arises from the peculiar empirical
properties of gravitation, properties summarized by Einstein’s Principle of the 
Equivalence of Gravitation and Inertia. For this reason, I have tried throughout 
this book to delay the introduction of geometrical objects, such as the metric, the 
affine connection, and the curvature, until the use of these objects could be 
motivated by considerations of physics. The order of chapters here thus bears very 
little resemblance to the order of history. 

Nevertheless, because we must not allow the history of physics “to slip away
into the abyss of oblivion,” this first chapter presents a brief backward look at 
three great antecedents to general relativity — non-Euclidean geometry, the 
Newtonian theory of gravitation, and the principle of relativity. Their history is 
traced up to 1916, the year in which they were brought together by Einstein in the 
General Theory of Relativity. 1 


1 History of Non-Euclidean Geometry 

Euclid showed in his Elements 2 how geometry could be deduced from a few 
definitions, axioms, and postulates. These assumptions for the most part dealt 
with the most fundamental properties of points, lines, and figures, and seem as 
self-evident to schoolboys in the twentieth century as they did to Hellenistic 
mathematicians in the third century B.C. However, one of Euclid’s assumptions
has always seemed a little less obvious than the others. The fifth postulate states:

“If a straight line falling on two straight lines make the interior angles on the 
same side less than two right angles, the two straight lines if produced indefinitely 
meet on that side on which the angles are less than two right angles.” 

For two thousand years geometers tried to purify Euclid’s system by proving that 
the fifth postulate is a logical consequence of his other assumptions. Today we 
know that this is impossible. Euclid was right: there is no logical inconsistency in a
geometry without the fifth postulate, and if we want it we will have to put it in at 
the beginning rather than prove it at the end. However, the struggle to prove the 
fifth postulate is one of the great success stories in the history of mathematics, 
because it ultimately gave birth to modern non-Euclidean geometry. 

The list of those who hoped to prove the fifth postulate as a theorem includes 
Ptolemy (d. 168), Proclos (410-485), Nasir al din al Tusi (thirteenth century), 
Levi ben Gerson (1288-1344), P. A. Cataldi (1548-1626), Giovanni Alfonso Borelli 
(1608-1679), Giordano Vitale (1633-1711), John Wallis (1616-1703), Girolamo
Saccheri (1667-1733), Johann Heinrich Lambert (1728-1777), and Adrien Marie 
Legendre (1752-1833). Without exception, their efforts only succeeded in replacing 
the fifth postulate with some other equivalent postulate, which might or might not 
seem more self-evident, but which in any case could not be proved from Euclid’s 
other postulates either. Thus, the Athenian neo-Platonist Proclos offered the 
substitute postulate: “If a straight line intersects one of two parallels, it will 
intersect the other also.” (That is, if we define parallel lines as straight lines that 
do not intersect however far extended, then there can be at most one line that 
passes through any given point and is parallel to a given line.) John Wallis, 
Savilian Professor at Oxford, showed that Euclid’s fifth postulate could be
replaced with the equivalent statement “Given any figure there exists a figure,
similar to it, of any size.” And Legendre proved the equivalence of the fifth
postulate with the statement “There is a triangle in which the sum of the three
angles is equal to two right angles.” 3

The attempt to dispense with Euclid’s fifth postulate began to take a different 
direction in the eighteenth century. In 1733 the Jesuit Girolamo Saccheri published
a detailed study of what geometry would be like if the fifth postulate were false. 
He particularly examined the consequences of what he called the “hypothesis of 
the acute angle,” that is, that “a straight line being given, there can be drawn a 
perpendicular to it and a line cutting it at an acute angle, which do not intersect 
each other.” 3 However, Saccheri did not really think that this is possible; he still 
believed in the logical necessity of the fifth postulate, and explored non-Euclidean
geometry only in the hope of eventually turning up a logical contradiction.
Similar tentative explorations of non-Euclidean geometry were begun by Lambert
and Legendre. 

It seems to have been Carl Friedrich Gauss (1777-1855) who first had the 
courage to accept non-Euclidean geometry as a logical possibility. His gradual
enlightenment is recorded in a series of letters 4 to W. Bolyai, Olbers, Schumacher, 
Gerling, Taurinus, and Bessel, extending from 1799 to 1844. In a letter dated 1824 
he begged Taurinus to keep silent about the “heretical opinions” he had revealed. 
Gauss even went to the extent of surveying a triangle 40 in the Harz mountains 
formed by Inselberg, Brocken, and Hoher Hagen to see if the sum of its interior 
angles was 180°! (It was.) Then, in 1832, Gauss received a letter from his friend 
Wolfgang Bolyai, describing the non-Euclidean geometry developed by his son, 
Janos Bolyai (1802-1860), an Austrian army officer. He subsequently also learned 
that a professor in Kazan, Nikolai Ivanovich Lobachevski (1793-1856), had
obtained similar results in 1826. 

Gauss, Bolyai, and Lobachevski had independently discovered what in modern 
terms is called the two-dimensional space of constant negative curvature. Such spaces
are still very interesting: we shall see in the chapter on cosmography that the space
in which we actually live may be a three-dimensional space of constant curvature. 
But to its discoverers the important thing about their new geometry was that it 
describes an infinite two-dimensional space in which all of Euclid’s assumptions 
are satisfied — except the fifth postulate! In this it is unique, which perhaps 
explains why it was discovered more or less independently in Germany, Austria, 
and Russia. (The surface of a sphere also satisfies Euclidean geometry without the 
fifth postulate, but being finite it does not have room for parallel lines.) We shall 
see in Chapter 13, on symmetric spaces, that the two-dimensional space of constant 
negative curvature cannot be realized as a surface in ordinary three-dimensional 
Euclidean space, which is doubtless why it took two millennia to find it. And of 
course it also violates the alternative “common-sense” versions of Euclid’s fifth 
postulate given by Proclos, Wallis, and Legendre — through a given point there 
can be drawn infinitely many lines parallel to any given line; no figures of different
size are similar; and the sum of the angles of any triangle is less than 180°. 

However, it still remained an open possibility that Euclid’s fifth postulate 
could be derived from the others, for it was not at all obvious that the geometry of 
Gauss, Bolyai, and Lobachevski did not contain a logical inconsistency. The usual
way to “prove” that a system of mathematical postulates is self-consistent is to
construct a model that satisfies the postulates out of some other system whose
consistency is (for the moment) unquestioned. For both Euclidean and
non-Euclidean geometry the “model” is provided by the theory of real numbers.
Descartes’ analytic geometry shows that if a point is identified with a pair of real
numbers $(x_1, x_2)$ and the distance between two points $(x_1, x_2)$ and $(X_1, X_2)$ is
identified as $[(x_1 - X_1)^2 + (x_2 - X_2)^2]^{1/2}$, then all of Euclid’s postulates can be
proved as theorems about real numbers. In 1870 a similar analytic geometry 5 was
constructed by Felix Klein (1849-1925) for the geometry of Gauss, Bolyai, and
Lobachevski: a “point” is represented as a pair of real numbers $x_1, x_2$ with

$$x_1^2 + x_2^2 < 1 \tag{1.1.1}$$


and the distance d(x, X) between two points x, X is defined by 


$$\cosh\left[\frac{d(x, X)}{a}\right] = \frac{1 - x_1 X_1 - x_2 X_2}{(1 - x_1^2 - x_2^2)^{1/2}\,(1 - X_1^2 - X_2^2)^{1/2}} \tag{1.1.2}$$

where a is a fundamental length which sets the scale of the geometry. Note that
this space is infinite, because $d(x, X) \to \infty$ as $X_1^2 + X_2^2$ approaches unity.
With this definition of “point” and “distance” one can verify that this model 
satisfies all of Euclid’s postulates except the fifth, and in fact obeys the geometry 
discovered by Gauss, Bolyai, and Lobachevski. Thus after two millennia the logical 
independence of Euclid’s fifth postulate was at last established. 
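
The distance rule (1.1.2) is easy to explore numerically. The following is a minimal sketch, assuming only the formula quoted above; the function name `klein_distance` and its argument layout are illustrative choices, not part of the original text.

```python
import math

def klein_distance(x, X, a=1.0):
    """Distance d(x, X) in Klein's model, from
    cosh(d/a) = (1 - x.X) / sqrt((1 - |x|^2) (1 - |X|^2))."""
    x1, x2 = x
    X1, X2 = X
    nx = 1.0 - x1 * x1 - x2 * x2
    nX = 1.0 - X1 * X1 - X2 * X2
    if nx <= 0.0 or nX <= 0.0:
        raise ValueError("both points must lie inside the unit disk")
    ratio = (1.0 - x1 * X1 - x2 * X2) / math.sqrt(nx * nX)
    # Rounding can push the ratio slightly below 1 for nearly coincident points.
    return a * math.acosh(max(ratio, 1.0))

# Distances from the origin grow without bound as the second point
# approaches the unit circle, which is why the space is infinite.
for r in (0.9, 0.99, 0.999, 0.999999):
    print(r, klein_distance((0.0, 0.0), (r, 0.0)))
```

For a point at coordinate radius r on the first axis, the formula reduces to d = a artanh(r), which makes the divergence at r = 1 explicit.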

This was just the beginning of the development of non-Euclidean geometry. 
We saw that in order to discover the geometry of Gauss, Bolyai, and Lobachevski 
it was necessary to give up the idea that a curved surface could only be described 
in terms of its embedding in ordinary three-dimensional space. How then can we
describe and classify curved spaces? To pick up our story we must go back to 1827
when Gauss published his Disquisitiones generales circa superficies curvas. Gauss for the
first time distinguished the inner properties of a surface, that is, the geometry
experienced by small flat bugs living in the surface, from its outer properties, that is, its
embedding in a higher-dimensional space, and he realized that it is the inner properties 
of surfaces that are “most worthy of being diligently explored by geometers.” 

Gauss also realized that the essential inner property of any surface is the 
metric function d(x, X), which gives the distance between x and X along the 
shortest path between them on the surface. For instance, a cone or a cylinder has 
the same local inner properties as a plane, since a plane can be rolled without 
stretching or tearing (i.e., without distorting metric relations) into a cone or a 
cylinder. On the other hand, all cartographers know that a sphere cannot be 
unrolled onto a plane surface without distortion, and thus its local inner properties 
are not the same as the plane’s. 

There is a simple example that has been used by Einstein, Wheeler, and others 
to illustrate how the inner properties of a surface can be discovered by exploring 
its metric. (See Figure 1.1.) Consider N points in a plane. We can use one point as 
an origin of coordinates and draw an x-axis through a second point, so that the
distances between the various points are described in terms of $(2N - 3)$
coordinates, that is, the x-coordinate of the second point and the x- and y-coordinates
of the remaining $(N - 2)$ points. But there are $N(N - 1)/2$ different distances
between the N points, and thus for large enough N these distances must be
subject to M algebraic relations, where

$$M = \frac{N(N - 1)}{2} - (2N - 3) = \frac{(N - 2)(N - 3)}{2} \tag{1.1.3}$$


For instance, in the simplest interesting case, N = 4, we can easily show that the
distances $d_{mn}$ between points m and n satisfy the single relation

This relation will be satisfied on any simply connected patch of a cylinder or a
cone, which share the same inner properties as the plane, but it will not be satisfied 
by a table of airline distances among any four cities, because the earth’s surface has 
different inner properties. There is a different relation appropriate to spherical 
surfaces, which is satisfied by airline mileage tables, and can be used to measure the 
radius of the earth. Of course, this is not the most convenient method and it is not 
the method used by Eratosthenes, but the important point here is that the curva- 
ture of the earth’s surface can be determined from its local inner properties. 
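
One standard way to express a relation of this kind among the six distances between four points of a plane is the vanishing of the Cayley-Menger determinant, which is proportional to the squared volume of the tetrahedron the points would span. The sketch below uses that determinant as a stand-in for the relation discussed above; it is not claimed to be the form in which the relation is written in the text, and the helper names and sample points are assumptions made for illustration.

```python
import math

def det(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    m = [row[:] for row in matrix]
    n = len(m)
    result = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if m[pivot][col] == 0.0:
            return 0.0
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            result = -result
        result *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return result

def cayley_menger(d):
    """Cayley-Menger determinant of four points; d[m][n] is the distance m-n."""
    rows = [[0.0, 1.0, 1.0, 1.0, 1.0]]
    for m in range(4):
        rows.append([1.0] + [d[m][n] ** 2 for n in range(4)])
    return det(rows)

def planar_distances(points):
    """Table of straight-line distances between points of a plane."""
    return [[math.dist(p, q) for q in points] for p in points]

def great_circle_distances(points, radius=1.0):
    """Table of distances along a sphere; points are (colatitude, longitude)."""
    vecs = [(math.sin(t) * math.cos(p), math.sin(t) * math.sin(p), math.cos(t))
            for t, p in points]
    table = []
    for u in vecs:
        row = []
        for v in vecs:
            c = sum(ui * vi for ui, vi in zip(u, v))
            row.append(radius * math.acos(max(-1.0, min(1.0, c))))
        table.append(row)
    return table

# Four points in a plane: the determinant vanishes up to rounding.
flat = cayley_menger(planar_distances([(0, 0), (1, 0), (0, 1), (2, 3)]))

# A pole and three equatorial points on a unit sphere: it does not.
curved = cayley_menger(great_circle_distances(
    [(0.0, 0.0), (math.pi / 2, 0.0), (math.pi / 2, math.pi / 2), (math.pi / 2, math.pi)]))

print(flat, curved)
```

A table of distances measured within the surface thus suffices to distinguish the sphere from the plane, with no reference to the surrounding space.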

Were our imaginations given free rein, we could conceive of a great variety of 
peculiar metric functions d(x, X). It was Gauss’s great contribution to pick out one 
particular class of metric spaces, which was broad enough to include the space 
of Gauss, Bolyai, and Lobachevski as well as that of ordinary curved surfaces, but 
narrow enough to deserve the name of geometry. Gauss assumed that in any 
sufficiently small region of the space it would be possible to find a locally Euclidean 
coordinate system $(\xi_1, \xi_2)$ so that the distance between two points with coordinates
$(\xi_1, \xi_2)$ and $(\xi_1 + d\xi_1, \xi_2 + d\xi_2)$ satisfies the law of Pythagoras,

$$ds^2 = d\xi_1^2 + d\xi_2^2 \tag{1.1.5}$$

For instance, we can set up such a locally Euclidean coordinate system at any 
point in an ordinary smooth curved surface by using the Cartesian coordinates of a 
plane tangent to the surface at the given point. However, this should not make us 
suppose that Gauss’s assumption has anything to do with outer properties; it 
deals only with inner metric relations for infinitesimal neighborhoods. 

If a surface is not Euclidean, it will not be possible to cover any finite part of 
it with a Euclidean coordinate system $(\xi_1, \xi_2)$ satisfying the law of Pythagoras.
Suppose that we use some other coordinate system $(x_1, x_2)$ that does cover the
space, and ask what form Gauss’s assumption takes in these coordinates. It is easy
to calculate that the distance ds between points $(x_1, x_2)$ and $(x_1 + dx_1, x_2 + dx_2)$
is given by

$$ds^2 = g_{11}(x_1, x_2)\, dx_1^2 + 2 g_{12}(x_1, x_2)\, dx_1\, dx_2 + g_{22}(x_1, x_2)\, dx_2^2 \tag{1.1.6}$$
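where

$$g_{11} = \left(\frac{\partial \xi_1}{\partial x_1}\right)^2 + \left(\frac{\partial \xi_2}{\partial x_1}\right)^2, \quad
g_{12} = \frac{\partial \xi_1}{\partial x_1}\frac{\partial \xi_1}{\partial x_2} + \frac{\partial \xi_2}{\partial x_1}\frac{\partial \xi_2}{\partial x_2}, \quad
g_{22} = \left(\frac{\partial \xi_1}{\partial x_2}\right)^2 + \left(\frac{\partial \xi_2}{\partial x_2}\right)^2 \tag{1.1.7}$$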

This form for $ds^2$ is the hallmark of a metric space. [We shall see in Chapter 3 that
this derivation can be reversed; given any space with ds given by (1.1.6), we can
at any point choose locally Euclidean coordinates $\xi_1, \xi_2$ satisfying (1.1.5).] For the
case of a sphere of radius a we can use spherical polar coordinates $\theta, \varphi$, and the
metric is

$$g_{\theta\theta} = a^2, \quad g_{\theta\varphi} = 0, \quad g_{\varphi\varphi} = a^2 \sin^2\theta \tag{1.1.8}$$

It is the factor $\sin^2\theta$ in $g_{\varphi\varphi}$ that gives a sphere different inner properties from a
plane. In the geometry of Gauss, Bolyai, and Lobachevski, we can use the
coordinates $x_1, x_2$ of Klein’s model, and find from the posited formula for d(x, X)
that

$$g_{11} = \frac{a^2 (1 - x_2^2)}{(1 - x_1^2 - x_2^2)^2}, \quad
g_{12} = \frac{a^2 x_1 x_2}{(1 - x_1^2 - x_2^2)^2}, \quad
g_{22} = \frac{a^2 (1 - x_1^2)}{(1 - x_1^2 - x_2^2)^2} \tag{1.1.9}$$


The length of any path can be determined by integrating ds along the path. 
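
As an illustration of integrating ds along a path, the sketch below sums $ds = (g_{11}\,dx_1^2 + 2g_{12}\,dx_1\,dx_2 + g_{22}\,dx_2^2)^{1/2}$ over many small steps, using the sphere metric (1.1.8). The function names, the parametrization of the path by t from 0 to 1, and the midpoint rule are choices made here for illustration only.

```python
import math

def sphere_metric(theta, phi, a=1.0):
    """Metric components (g_11, g_12, g_22) of a sphere of radius a, Eq. (1.1.8)."""
    return a * a, 0.0, (a * math.sin(theta)) ** 2

def path_length(path, metric, steps=10000):
    """Integrate ds along path(t), 0 <= t <= 1, with the midpoint rule."""
    total = 0.0
    x_prev = path(0.0)
    for k in range(1, steps + 1):
        x_next = path(k / steps)
        d1 = x_next[0] - x_prev[0]
        d2 = x_next[1] - x_prev[1]
        # Evaluate the metric halfway along the small step.
        mid = (0.5 * (x_prev[0] + x_next[0]), 0.5 * (x_prev[1] + x_next[1]))
        g11, g12, g22 = metric(*mid)
        total += math.sqrt(g11 * d1 * d1 + 2.0 * g12 * d1 * d2 + g22 * d2 * d2)
        x_prev = x_next
    return total

# A circle of constant colatitude pi/3 on the unit sphere.
circle = path_length(lambda t: (math.pi / 3, 2.0 * math.pi * t), sphere_metric)

# A quarter meridian from the pole to the equator.
meridian = path_length(lambda t: (0.5 * math.pi * t, 0.0), sphere_metric)

print(circle, meridian)
```

The circle comes out shorter than $2\pi$ times its coordinate distance from the pole, which is the effect of the factor $\sin^2\theta$ in the metric.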

The metric functions $g_{ij}$ determine all inner properties of a metric space, but
they also depend on how we choose the coordinate mesh. For instance, we can use
polar coordinates $r, \theta$ to describe a plane surface, and find that the metric functions
are

$$g_{rr} = 1, \quad g_{r\theta} = 0, \quad g_{\theta\theta} = r^2 \tag{1.1.10}$$

This does not look like a Euclidean space, but of course it is, as we can show
formally by transforming to Cartesian coordinates $x = r\cos\theta$, $y = r\sin\theta$.
More generally, a change of coordinates from $(x_1, x_2)$ to $(x'_1, x'_2)$ will change the
metric functions $g_{ij}$ to $g'_{ij}$ where, for instance,


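$$\begin{aligned}
g'_{11} &= \left(\frac{\partial \xi_1}{\partial x'_1}\right)^2 + \left(\frac{\partial \xi_2}{\partial x'_1}\right)^2 \\
&= \left(\frac{\partial \xi_1}{\partial x_1}\frac{\partial x_1}{\partial x'_1} + \frac{\partial \xi_1}{\partial x_2}\frac{\partial x_2}{\partial x'_1}\right)^2
 + \left(\frac{\partial \xi_2}{\partial x_1}\frac{\partial x_1}{\partial x'_1} + \frac{\partial \xi_2}{\partial x_2}\frac{\partial x_2}{\partial x'_1}\right)^2 \\
&= g_{11}\left(\frac{\partial x_1}{\partial x'_1}\right)^2 + 2 g_{12}\,\frac{\partial x_1}{\partial x'_1}\frac{\partial x_2}{\partial x'_1} + g_{22}\left(\frac{\partial x_2}{\partial x'_1}\right)^2
\end{aligned} \tag{1.1.11}$$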
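
As a check on this transformation rule, one can take the unprimed coordinates to be Cartesian coordinates $(x, y)$ of the plane, for which $g_{11} = g_{22} = 1$ and $g_{12} = 0$, and the primed coordinates to be polar coordinates $(r, \theta)$ with $x = r\cos\theta$, $y = r\sin\theta$. The following is a worked sketch under those assumptions; the labels $g'_{rr}$, $g'_{r\theta}$, $g'_{\theta\theta}$ are notation introduced here for the primed components, and the off-diagonal component is computed with the analogous rule involving one derivative with respect to each primed coordinate.

$$\begin{aligned}
g'_{rr} &= \left(\frac{\partial x}{\partial r}\right)^2 + \left(\frac{\partial y}{\partial r}\right)^2 = \cos^2\theta + \sin^2\theta = 1 \\
g'_{r\theta} &= \frac{\partial x}{\partial r}\frac{\partial x}{\partial \theta} + \frac{\partial y}{\partial r}\frac{\partial y}{\partial \theta} = -r\sin\theta\cos\theta + r\sin\theta\cos\theta = 0 \\
g'_{\theta\theta} &= \left(\frac{\partial x}{\partial \theta}\right)^2 + \left(\frac{\partial y}{\partial \theta}\right)^2 = r^2\sin^2\theta + r^2\cos^2\theta = r^2
\end{aligned}$$

These are the metric functions of the plane in polar coordinates, so the unfamiliar form of the metric reflects only the choice of coordinate mesh.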


How then can we tell the inner properties of a space by looking at its metric
coefficients? What we need is some function of the $g_{ij}$ and their derivatives that
depends only on the inner properties of the space and not, like the $g_{ij}$, also on the
particular coordinate system chosen to describe the space.

Gauss found this function, and found it to be essentially unique; it is the 
so-called Gaussian curvature: 

where g is the determinant

$$g(x_1, x_2) = g_{11}\, g_{22} - g_{12}^2$$

(The reader should not quail at the awful appearance of this formula. After 
introducing a certain amount of mathematical formalism, we shall be able to 
derive and discuss the curvature in a far more compact and elegant notation, in 
Chapter 6.) By applying Eq. (1.1.12) to the metric functions (1.1.8) and (1.1.9), we 
find that the surface of a sphere is a space of constant positive curvature 

$$K = \frac{1}{a^2} \quad \text{(sphere)} \tag{1.1.13}$$

whereas the space of Gauss, Bolyai, and Lobachevski has constant negative
curvature

$$K = -\frac{1}{a^2} \quad \text{(G-B-L)} \tag{1.1.14}$$

(Incidentally, there is nothing very exotic about negative curvature; an ordinary
saddle is negatively curved. It is the constancy of K that makes the geometry of 
Gauss, Bolyai, and Lobachevski unrealizable for ordinary curved surfaces. It is also 
obvious that only with K constant could the other postulates of Euclid be satisfied, 
because these other postulates describe an intrinsically homogeneous space, 
whereas if K varied from point to point then the inner properties of the space 
would vary with it.) Finally, if we apply our formula for K to the metric (1.1.10) 
that describes a plane in polar coordinates, then we find 

$$K = 0 \quad \text{(plane)} \tag{1.1.15}$$

as of course we must. Thus, however perverse we are in our choice of coordinate 
system, the inner properties of a space can still be revealed by the straightforward 
procedure of calculating K. 
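
These three values of K can be checked numerically. The sketch below does not use the explicit formula of the text; it takes the standard differential-geometric route of building the Christoffel symbols and the Riemann tensor from the metric by finite differences, and then forming $K = R_{1212}/g$. The function names, the step size h, and the sample points are assumptions chosen for illustration.

```python
import math

def sphere_metric(theta, phi, a=1.0):
    """(g_11, g_12, g_22) for a sphere of radius a in polar coordinates."""
    return a * a, 0.0, (a * math.sin(theta)) ** 2

def klein_metric(x1, x2, a=1.0):
    """(g_11, g_12, g_22) for Klein's model of the Gauss-Bolyai-Lobachevski plane."""
    w = (1.0 - x1 * x1 - x2 * x2) ** 2
    return a * a * (1.0 - x2 * x2) / w, a * a * x1 * x2 / w, a * a * (1.0 - x1 * x1) / w

def polar_metric(r, theta):
    """(g_11, g_12, g_22) for the plane in polar coordinates."""
    return 1.0, 0.0, r * r

def _shifted(x, axis, step):
    """Copy of the point x displaced by step along one coordinate axis."""
    y = list(x)
    y[axis] += step
    return y

def christoffel(metric, x, h=1e-4):
    """Gamma[a][b][c] = Gamma^a_bc at x, from central differences of the metric."""
    def g_at(p):
        g11, g12, g22 = metric(*p)
        return [[g11, g12], [g12, g22]]
    g = g_at(x)
    det = g[0][0] * g[1][1] - g[0][1] ** 2
    inv = [[g[1][1] / det, -g[0][1] / det], [-g[0][1] / det, g[0][0] / det]]
    dg = []  # dg[c][a][b] = d g_ab / d x_c
    for c in range(2):
        gp, gm = g_at(_shifted(x, c, h)), g_at(_shifted(x, c, -h))
        dg.append([[(gp[a][b] - gm[a][b]) / (2 * h) for b in range(2)] for a in range(2)])
    return [[[0.5 * sum(inv[a][e] * (dg[b][e][c] + dg[c][e][b] - dg[e][b][c])
                        for e in range(2))
              for c in range(2)] for b in range(2)] for a in range(2)]

def gaussian_curvature(metric, x, h=1e-4):
    """K = R_1212 / g, with the Riemann tensor built from the Christoffel symbols."""
    g11, g12, g22 = metric(*x)
    gam = christoffel(metric, x, h)
    dgam = []  # dgam[c][a][b][d] = d Gamma^a_bd / d x_c
    for c in range(2):
        gp = christoffel(metric, _shifted(x, c, h), h)
        gm = christoffel(metric, _shifted(x, c, -h), h)
        dgam.append([[[(gp[a][b][d] - gm[a][b][d]) / (2 * h) for d in range(2)]
                      for b in range(2)] for a in range(2)])
    def riemann_up(a):
        # R^a_{bcd} with (b, c, d) = (2, 1, 2), i.e. zero-based (1, 0, 1)
        b, c, d = 1, 0, 1
        value = dgam[c][a][b][d] - dgam[d][a][b][c]
        for e in range(2):
            value += gam[a][c][e] * gam[e][b][d] - gam[a][d][e] * gam[e][b][c]
        return value
    r_1212 = g11 * riemann_up(0) + g12 * riemann_up(1)
    return r_1212 / (g11 * g22 - g12 * g12)

print(gaussian_curvature(sphere_metric, (1.0, 0.3)))   # close to +1 for a = 1
print(gaussian_curvature(klein_metric, (0.3, 0.2)))    # close to -1 for a = 1
print(gaussian_curvature(polar_metric, (1.5, 0.7)))    # close to 0
```

Because the derivatives are taken by finite differences, the results agree with $1/a^2$, $-1/a^2$, and 0 only to within a small numerical error, and the same values are obtained at other sample points, reflecting the constancy of K.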

Having come so far, it was not long before mathematicians turned to the 
problem of describing the inner properties of curved spaces having three or more 
dimensions. It was not a trivial matter to expand the work of Gauss to more than 
two dimensions, because the inner properties of such spaces cannot be described 
by a single curvature function K. In D dimensions there will be $D(D + 1)/2$
independent metric functions $g_{ij}$, and our freedom to choose the D coordinates at
will allows us to impose D arbitrary functional relations on the $g_{ij}$, leaving C
functions that truly express the inner properties of the space, where

$$C = \frac{D(D + 1)}{2} - D = \frac{D(D - 1)}{2}$$

For D = 2, C = 1, as found by Gauss. For D > 2, C > 1, and the description of
the geometry becomes much more complicated. This problem was completely 
solved in 1854 by Georg Friedrich Bernhard Riemann (1826-1866), who presented 
what we now call Riemannian geometry in his Göttingen inaugural lecture, Über
die Hypothesen, welche der Geometrie zu Grunde liegen. Subsequent work by
Christoffel, Ricci, Levi-Civita, Beltrami, and others developed Riemann’s ideas
into the beautiful mathematical structure described in our chapters on tensor
analysis and curvature. However, it remained for Einstein to see the use physics
could make of non-Euclidean geometry.


2 History of the Theory of Gravitation 

At the end of the Principia, Isaac Newton (1642-1727) described gravitation 
as a cause that operates on the sun and planets “according to the quantity of solid 
matter which they contain and propagates on all sides to immense distances, 
decreasing always as the inverse square of the distances.” 6 There are two parts to 
Newton's law, which were discovered in different ways, and which played different 
roles in the development of mechanics from Newton to Einstein. 

It was of course Galileo Galilei (1564-1642) who discovered that bodies fall at a 
rate independent of their mass. His tools were an inclined plane to slow the fall, 
a water clock to measure its duration, and also a pendulum, to avoid rolling friction. 
These observations were later improved by Christiaan Huygens (1629-1695).
Newton could thus use his second law to conclude that the force exerted by
gravitation is proportional to the mass of the body on which it acts; the third
law then ensures that the force is also proportional to the mass of its source. 

Newton was well aware that these conclusions might be only approximately 
true, and that the “inertial mass” entering in his second law might not be precisely 
the same as the “gravitational mass” appearing in the law of gravitation. If this 
were the case, we would have to write Newton’s second law as 

$$\mathbf{F} = m_i\, \mathbf{a} \tag{1.2.1}$$

and write the law of gravitation as

$$\mathbf{F} = m_g\, \mathbf{g} \tag{1.2.2}$$

where $\mathbf{g}$ is a field depending on position and other masses. The acceleration at a
given point would be

$$\mathbf{a} = \left(\frac{m_g}{m_i}\right) \mathbf{g} \tag{1.2.3}$$

and would be different for bodies with different values for the ratio $m_g/m_i$; in
particular, pendulums of equal length would have periods proportional to
$(m_i/m_g)^{1/2}$. Newton tested this possibility by experiments with pendulums of
equal length but different composition, and found no difference in their periods. 
This result was later verified more accurately by Friedrich Wilhelm Bessel (1784- 
1846) in 1830. Then, in 1889, Roland von Eötvös 7 succeeded by a different method
in showing that the ratio $m_g/m_i$ does not differ from one substance to another by
more than one part in $10^9$. (See Figure 1.2.) Eötvös hung two weights A and B
from the ends of a 40-cm beam suspended on a fine wire at its center. At equilibrium
the beam would sag in such a way that

$$l_A (m_{gA}\, g - m_{iA}\, g'_z) = l_B (m_{gB}\, g - m_{iB}\, g'_z) \tag{1.2.4}$$


Figure 1.2 Schematic view of the Eötvös experiment.


where g is the earth’s gravitational field, $g'_z$ is the vertical component of the
centripetal acceleration due to the earth’s rotation, and $l_A$ and $l_B$ are the effective lever
arms for the two weights. [Of course Eötvös chose weights and lever arms to be
nearly equal, but the point of his method is that even if A is a little bigger than B,
the beam will still sag just so as to make (1.2.4) correct.] At the latitude of Budapest
the centripetal acceleration due to the earth’s rotation also has an appreciable
horizontal component $g'_s$, giving to the balance a torque around the vertical axis
equal to

$$T = l_A\, m_{iA}\, g'_s - l_B\, m_{iB}\, g'_s$$

Using the equilibrium condition to determine $l_B$, we have then

$$T = l_A\, m_{iA}\, g'_s \left[ 1 - \frac{(m_{gA}/m_{iA})\, g - g'_z}{(m_{gB}/m_{iB})\, g - g'_z} \right]$$

or, since $g'_z$ is much less than g,

$$T = l_A\, g'_s\, m_{gA} \left[ \frac{m_{iA}}{m_{gA}} - \frac{m_{iB}}{m_{gB}} \right]$$

Any inequality in the ratios $m_i/m_g$ for the two weights would thus tend to twist
the wire from which the balance was suspended. No twist was detected, and
Eötvös concluded from this that the difference of $m_i/m_g$ for wood and platinum
was less than $10^{-9}$.
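
To get a feeling for the sizes involved, the sketch below evaluates the torque on the balance, taking it to be $T = l_A m_{iA} g'_s - l_B m_{iB} g'_s$ with $l_B$ eliminated through the equilibrium condition (1.2.4), and compares the result with its limit for $g'_z$ much less than g. The physical constants are standard rounded values; the lever arm of 0.2 m (half the 40-cm beam), the 30-gram weight, and the latitude of 47.5 degrees are hypothetical inputs chosen only for illustration.

```python
import math

OMEGA = 7.292e-5     # earth's rotation rate, rad/s (rounded)
R_EARTH = 6.371e6    # mean radius of the earth, m (rounded)
G_SURFACE = 9.81     # gravitational acceleration at the surface, m/s^2 (rounded)

def centripetal_components(latitude_deg):
    """Vertical and horizontal components (g'_z, g'_s) of the centripetal
    acceleration due to the earth's rotation at the given latitude."""
    lat = math.radians(latitude_deg)
    a_c = OMEGA ** 2 * R_EARTH * math.cos(lat)   # directed toward the rotation axis
    return a_c * math.cos(lat), a_c * math.sin(lat)

def torque_exact(l_a, m_ia, ratio_a, ratio_b, g, gz, gs):
    """T = l_A m_iA g'_s [1 - (r_A g - g'_z) / (r_B g - g'_z)], with r = m_g/m_i."""
    return l_a * m_ia * gs * (1.0 - (ratio_a * g - gz) / (ratio_b * g - gz))

def torque_approx(l_a, m_ia, ratio_a, ratio_b, gs):
    """Limit g'_z << g: T = l_A g'_s m_gA (m_iA/m_gA - m_iB/m_gB)."""
    m_ga = ratio_a * m_ia
    return l_a * gs * m_ga * (1.0 / ratio_a - 1.0 / ratio_b)

gz, gs = centripetal_components(47.5)    # hypothetical latitude, roughly Budapest
# Hypothetical apparatus: 0.2 m lever arm, 0.03 kg weight, ratios differing by 1e-9.
exact = torque_exact(0.2, 0.03, 1.0, 1.0 + 1e-9, G_SURFACE, gz, gs)
approx = torque_approx(0.2, 0.03, 1.0, 1.0 + 1e-9, gs)
print(gs, exact, approx)
```

With these inputs the horizontal component $g'_s$ is a fraction of a percent of g, and the torque produced by a difference of one part in $10^9$ is extremely small, which indicates how delicate the suspension had to be.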

Einstein was very impressed with the observed equality of gravitational and 
inertial mass 8 , and as we shall see, it served him as a signpost toward the Principle 
of Equivalence. (It also sets very stringent limits on any possible nongravitational 
forces that might exist. For instance, any new kind of electrostatic force in which 
the number of nucleons plays the role of charge would have to be much weaker 
than gravitation. y ) In recent years a group under R. H. Dicke lu at Princeton has 
improved on Eotvos’ method, by using the gravitational field of the sun and the 
earth’s centripetal acceleration toward the sun, rather than the rotation of the 
earth, to produce the torque on the balance. The advantage is that the angle 
between the direction of the sun and the balance arm changed with a 24-hr period, 
and so Dicke could filter out of his data any noise not at the diurnal frequency. In 
this way he concluded that “aluminum and gold fall toward the sun with the same 
acceleration, the accelerations differing from each other by at most one part in 
$10^{11}$.” It has also been shown (with very much less precision) that neutrons fall
with the same acceleration as ordinary matter, 11 and that the gravitational force
on electrons in copper is the same as on free electrons. 12

We now move on to the second part of Newton’s law of gravitation, which says 
that the force decreases as the inverse square of the distance. This idea was not 
entirely original with Newton. Johannus Scotus Erigena (c. 800-c. 877) had guessed 
that heaviness and lightness vary with distance from the earth. This theory was 
taken up by Adelard of Bath (twelfth century), who realized that a stone dropped 
into a very deep well could fall no farther than the center of the earth. (Incidentally, 
Adelard also translated Euclid from Arabic into Latin, thus making it available to 
medieval Europe.) The first suggestion of an inverse-square law may have been 
made around 1640 by Ismael Bullialdus (1605-1694). However, it was certainly 
Newton who in 1665 or 1666 first deduced the inverse-square law from observa- 
tions. He knew that the moon falls toward the earth a distance 0.0045 ft. each 
second, and he knew that the moon is 60 earth radii away from the center of the 
earth. Hence, if the gravitational force obeys an inverse-square law, then an apple
in Lincolnshire (which is 1 earth radius away from the center of the earth) should 
fall in the first second 3600 times 0.0045 ft, or about 16 ft, in good agreement with 
the measured value. However, Newton did not publish this calculation for twenty 
years, because he did not know how to justify the fact that he had treated the earth 
as if its whole mass were concentrated at its center. Meanwhile, it became known to 
several members of the Royal Society, including Edmund Halley (1656-1742), 
Christopher Wren (1632-1723), and Robert Hooke (1635-1703), that Kepler’s third 
law would imply an inverse-square law of force if the orbits of planets were circular. 
That is, if the squares of the periods, $r^2/v^2$, are proportional to the cubes of the
radii $r^3$, then the centripetal acceleration $v^2/r$ is proportional to $1/r^2$. However, the
planets actually move on ellipses, not circles, and no one knew how to calculate 
their centripetal acceleration. Under Halley’s instigation, Newton in 1684 proved 
that planets moving under the influence of an inverse-square-law force would
indeed obey all the empirical laws of Johannes Kepler (1571-1630); that is, they
would move on ellipses with the sun at a focus, they would sweep out equal areas in 
equal times, and the square of their periods would be proportional to the cube of 
their major axes. Finally, in 1685, Newton was able to complete his lunar calcula- 
tion of 1665. These stupendous accomplishments were published on July 5, 1686, 
under the title Philosophiae Naturalis Principia Mathematica. 13
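
The arithmetic in Newton's lunar test, and the step from Kepler's third law to an inverse-square acceleration for circular orbits, can both be reproduced in a few lines. This is a sketch only: the comparison value of 32.17 ft/sec² for the acceleration of gravity is a standard figure assumed here, not one quoted in the text, and the function name and the Kepler constant are placeholders.

```python
import math

MOON_FALL_FT = 0.0045  # distance the moon falls toward the earth in one second
DISTANCE_RATIO = 60    # moon's distance from the earth's center, in earth radii

# Inverse-square scaling from 60 earth radii down to 1 earth radius.
apple_fall_ft = DISTANCE_RATIO**2 * MOON_FALL_FT

# Assumed comparison value: with g about 32.17 ft/s^2, a body dropped from
# rest falls g/2 feet in the first second.
measured_fall_ft = 0.5 * 32.17

def centripetal_acceleration(radius, kepler_constant=1.0):
    """v^2/r for a circular orbit whose period obeys T^2 = kepler_constant * r^3."""
    period = math.sqrt(kepler_constant * radius**3)
    speed = 2.0 * math.pi * radius / period
    return speed**2 / radius

# For an inverse-square law, acceleration * r^2 is the same for every orbit.
products = [centripetal_acceleration(r) * r**2 for r in (1.0, 2.0, 5.0)]
print(apple_fall_ft, measured_fall_ft, products)
```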

In the following centuries Newton’s law of gravitation met with a brilliant 
series of successes in explaining the motion of the moon and planets. Some 
irregularities in the orbit of Uranus remained unexplained until, in 1846, they were 
independently used by John Couch Adams (1819-1892) in England and Urbain 
Jean Joseph LeVerrier (1811-1877) in France to predict the existence and position 
of Neptune. The discovery of Neptune shortly thereafter was perhaps the most 
splendid verification of Newton’s theory. The motion of the moon and Encke’s 
comet (and, later, Halley’s comet) still showed departures from Newtonian theory, 
but it was clear that nongravitational forces could be at work. 

One problem remained. A year before his prediction of Neptune, LeVerrier 
had calculated that the observed precession of the perihelia of Mercury was 
35"/century faster than what would be expected according to Newton’s theory 
from the known perturbing fields of the other planets. This discrepancy was 
confirmed in 1882 by Simon Newcomb (1835-1909), who gave a value of 43" for 
the excess centennial precession. 14 LeVerrier had thought that this excess was 
probably due to a group of small planets between Mercury and the sun, but after a 
careful search none were discovered. Newcomb then suggested that perhaps the 
matter responsible for the faint “zodiacal light” seen in the plane of the ecliptic 
of the solar system was also responsible for the excess precession of Mercury. 
However, his calculations showed that the amount of matter needed to account 
for the precession of Mercury would, if placed in the plane of the ecliptic, produce a 
rotation of the plane of the orbits (that is, a precession of the nodes) of both
Mercury and Venus different from what had been observed. For this reason, 
Newcomb was led by 1895 “to drop these explorations as unsatisfactory, and to 
prefer provisionally the hypothesis that the Sun’s gravitation is not exactly as the 
inverse square.” 15 

Unfortunately this was not the last word. In 1896 H. H. Seeliger constructed 
an elaborate model of the zodiacal light, placing the matter responsible on ellipsoids 
close to the sun, which could account for the excess precession of Mercury without 
upsetting the agreement between theory and experiment for the rotation of the 
planes of the inner planets’ orbits. Today we know that this model is totally wrong, 
and that there simply is not enough interplanetary matter to account for the 
observed excess precession of Mercury. However, Seeliger’s hypothesis, together 
with the continued success of Newtonian theory elsewhere, convinced Newcomb 
that there was no need to alter the law of gravitation. 15

I do not know whether Einstein was very much influenced, in creating general 
relativity, by the problem of the precession of Mercury’s perihelia. However, there
is no doubt that the first confirmation of his theory was that it predicted an excess
precession of precisely 43"/century. 


3 History of the Principle of Relativity 

Newtonian mechanics defined a family of reference frames, the so-called 
inertial frames , within which the laws of nature take the form given in the Principia. 
For instance, the equations for a system of point particles interacting gravitationally 
are 


\[ m_N \frac{d^2 \mathbf{x}_N}{dt^2} = G \sum_M \frac{m_N m_M (\mathbf{x}_M - \mathbf{x}_N)}{|\mathbf{x}_M - \mathbf{x}_N|^3} \tag{1.3.1} \]


where $m_N$ is the mass of the $N$th particle and $\mathbf{x}_N$ is its Cartesian position vector
at time $t$. It is a simple matter to check that these equations take the same form
when written in terms of a new set of space-time coordinates:


\[ \mathbf{x}' = P\mathbf{x} + \mathbf{v}t + \mathbf{d}, \qquad t' = t + \tau \tag{1.3.2} \]


where $\mathbf{v}$, $\mathbf{d}$, and $\tau$ are any real constants, and $P$ is any real orthogonal
matrix. (If $O$ and $O'$ use the unprimed and primed coordinate system, respectively,
then $O'$ sees the $O$ coordinate axes rotated by $P$, moving with velocity $\mathbf{v}$,
displaced at $t = 0$ by $\mathbf{d}$, and $O'$ sees the $O$ clock running behind his own by a
time $\tau$.) The transformations (1.3.2) form a 10-parameter group (three Euler angles
in $P$, plus three components each for $\mathbf{v}$ and $\mathbf{d}$, plus one $\tau$) today called the
Galileo group, and the invariance of the laws of motion under such transformations
is today called Galilean invariance, or the Principle of Galilean Relativity.
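
The "simple matter to check" can also be done numerically. Under (1.3.2) the coordinate differences $\mathbf{x}_M - \mathbf{x}_N$ are merely rotated by $P$, so the right-hand side of (1.3.1) is rotated by $P$, and so is the acceleration on the left. The sketch below tests the force side for one sample configuration. The helper names, the sample masses and positions, and the restriction of $P$ to a rotation about the 3-axis are all assumptions made to keep the example short.

```python
import math

def gravitational_forces(masses, positions, G=1.0):
    """Right-hand side of (1.3.1) for every particle."""
    forces = []
    for n, (m_n, x_n) in enumerate(zip(masses, positions)):
        total = [0.0, 0.0, 0.0]
        for m, (m_m, x_m) in enumerate(zip(masses, positions)):
            if m == n:
                continue
            diff = [a - b for a, b in zip(x_m, x_n)]
            dist = math.sqrt(sum(c * c for c in diff))
            for k in range(3):
                total[k] += G * m_n * m_m * diff[k] / dist**3
        forces.append(total)
    return forces

def rotate_z(angle, x):
    """A sample orthogonal matrix P: rotation by `angle` about the 3-axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [c * x[0] - s * x[1], s * x[0] + c * x[1], x[2]]

def galilean(x, t, angle, v, d):
    """x' = P x + v t + d, as in (1.3.2)."""
    rx = rotate_z(angle, x)
    return [rx[k] + v[k] * t + d[k] for k in range(3)]

masses = [1.0, 2.0, 3.0]
positions = [[0.0, 0.0, 0.0], [1.0, 0.5, -0.3], [-0.7, 2.0, 1.1]]
angle, v, d, t = 0.4, [0.3, -1.2, 0.5], [5.0, 6.0, 7.0], 2.5

forces = gravitational_forces(masses, positions)
moved = [galilean(x, t, angle, v, d) for x in positions]
forces_moved = gravitational_forces(masses, moved)
expected = [rotate_z(angle, f) for f in forces]  # P applied to the old forces

max_error = max(abs(a - b)
                for fa, fb in zip(forces_moved, expected)
                for a, b in zip(fa, fb))
print(max_error)
```

If `angle` or `v` were allowed to depend on `t`, the acceleration would pick up extra terms that the force side does not have, which is the loss of form-invariance described in the next paragraph.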

What really impressed Newton about all this was that there are a great many 
more transformations that do not leave the equations of motion invariant. For 
instance, (1.3.1) does not retain its form if we transform into an accelerating or a 
rotating coordinate system, that is, if we let v or P depend on t. The equations of 
motion can hold in their usual form in only a limited class of coordinate systems, 
called inertial frames. What then determines which reference frames are inertial
frames? Newton answered that there must exist an absolute space, and that the
inertial frames were those at rest in absolute space, or in a state of uniform motion
with respect to absolute space. In his words 16:

“Absolute space, in its own nature and with regard to anything external, always 
remains similar and unmovable. Relative space is some movable dimension or 
measure of absolute space, which our senses determine by its position with 
respect to other bodies, and is commonly taken for absolute space.”

Newton also described several experiments that demonstrated what he interpreted
as the effects of rotation with respect to absolute space. The most famous is
the rotating bucket 17:

“If a bucket, suspended by a long cord, is so often turned about that finally 
the cord is strongly twisted, then is filled with water, and held at rest together 
with the water; and afterwards by the action of a second force, it is suddenly 
set whirling about the contrary way, and continues, while the cord is untwisting 
itself, for some time in this motion; the surface of the water will at first be level,
just as it was before the vessel began to move; but subsequently the vessel, by
gradually communicating its motion to the water, will make it begin sensibly to
rotate, and the water will recede little by little from the middle and rise up at the
sides of the vessel; its surface assuming a concave form. (This experiment I
have made myself.) ... At first, when the relative motion of the water in the vessel
was greatest, that motion produced no tendency whatever of recession from the 
axis, the water made no endeavor to move upwards towards the circumference, 
by rising at the sides of the vessel, but remained level, and for that reason its 
true circular motion had not yet begun. But afterwards, when the relative 
motion of the water had decreased, the rising of the water at the sides of the 
vessel indicated an endeavor to recede from the axis; and this endeavor reveals
the real circular motion of the water, continually increasing till it had reached its 
greatest point, when relatively the water was at rest in the vessel. ...” 


Newton’s conception of absolute space was rejected by his great opponent 
Gottfried Wilhelm von Leibniz (1646-1716), who argued that there is no philo- 
sophical need for any conception of space apart from the relations of material 
objects. The issue was debated in a famous series of letters 18 (1715-1716) between 
Leibniz and Newton’s supporter, Samuel Clarke (1675-1729), and philosophers 
continued the argument, with Newton’s position defended by Leonhard Euler 
(1707-1783) and Immanuel Kant (1724-1804) and attacked by Bishop George 
Berkeley (1685-1753) in his Principles of Human Knowledge (1710) and Analyst 
(1734). Of course none of this high-minded metaphysics led to any idea about how 
to develop a dynamical theory that might replace Newton’s. 

The first constructive attack on Newtonian absolute space was launched in the 
1880’s by the Austrian philosopher Ernst Mach (1836-1916). In his book Die 
Mechanik in ihrer Entwicklung 19 he remarks that 

“Newton’s experiment with the rotating vessel of water simply informs us, that 
the relative rotation of the water with respect to the sides of the vessel produces 
no noticeable centrifugal forces, but that such forces are produced by its relative 
motion with respect to the mass of the Earth and the other celestial bodies. No 
one is competent to say how the experiment would turn out if the sides of the 
vessel increased in thickness and mass until they were several leagues thick.”

The hypothesis, that there is some influence of “the mass of the Earth and the
other celestial bodies” which determines the inertial frames, is called Mach's
principle.

There is a simple experiment that anyone can perform on a starry night, to 
clarify the issues raised by Mach’s principle. First stand still, and let your arms 
hang loose at your sides. Observe that the stars are more or less unmoving, and 
that your arms hang more or less straight down. Then pirouette. The stars will 
seem to rotate around the zenith, and at the same time your arms will be drawn 
upward by centrifugal force. It would surely be a remarkable coincidence if the 
inertial frame, in which your arms hung freely, just happened to be the reference 
frame in which typical stars are at rest, unless there were some interaction between 
the stars and you that determined your inertial frame. 

This argument can be made more precise. The surface of the earth is not 
exactly an inertial frame, and of course the rotation and revolution of the earth 
give the stars an apparent motion, but these effects can be eliminated by using the 
inertial frame defined by the solar system as a whole. In this inertial frame of 
reference the average observed rotation of the galaxies with respect to any axis 
through the sun is less than about 1 arc-sec/century! 20

We seem to be faced with an unavoidable choice: Either we admit that there
is a Newtonian absolute space-time, which defines the inertial frames and with 
respect to which typical galaxies happen to be at rest, or we must believe with 
Mach that inertia is due to an interaction with the average mass of the universe. 
And if Mach is right, then the acceleration given a particle by a given force ought 
to depend not only on the presence of the fixed stars but also, very slightly, on the 
distribution of matter in the immediate vicinity of the particle. We shall see in 
Chapter 3 that Einstein’s equivalence principle gives an answer to the problem of 
inertia that does not refer to a Newtonian absolute space and yet does not quite 
agree with the conclusions of Mach. The issue is not closed. 

I have not yet mentioned special relativity because, despite its name, it really 
does not affect the antinomy between absolute and relative space. However, we 
shall have to formulate the equivalence principle in special-relativistic terms, so a 
detailed review of special relativity is presented in the next chapter; for the 
moment we only take a glance at its history. 

The theory of electrodynamics presented in 1864 by James Clerk Maxwell
(1831-1879) clearly did not satisfy the principle of Galilean relativity. For one
thing, Maxwell’s equations predict that the speed of light in vacuum is a universal
constant $c$, but if this is true in one coordinate system $x^i$, $t$, then it will not be true
in the “moving” coordinate system $x'^i$, $t'$ defined by the Galilean transformation
(1.3.2). Maxwell himself thought that electromagnetic waves were carried by a 
medium, 21 the luminiferous ether, so that his equations would hold in only a 
limited class of Galilean inertial frames, that is, in those coordinate frames at rest 
with respect to the ether. 

However, all attempts to measure the velocity of the earth with respect to the 
ether failed, 22 even though the earth has a velocity of 30 km/sec relative to the
sun, and about 200 km/sec relative to the center of our galaxy. The most important
experiment was that of Albert Abraham Michelson (1852-1931) and E. W. 
Morley, 23 which showed in 1887 that the velocity of light is the same, within 
5 km/sec, for light traveling along the direction of the earth’s orbital motion and 
transverse to it. The accuracy of this result has been recently improved to about 
1 km/sec. 24 

The persistent failure of experimentalists to discover effects of the earth’s 
motion through the ether led theorists, including George Francis Fitzgerald 25 
(1851-1901), Hendrik Antoon Lorentz 26 (1853-1928), and Jules Henri Poincare 27 
(1854-1912) to suggest reasons why such “ether drift” effects should be in principle 
unobservable. (See Figure 1.3.) Poincare in particular seems to have glimpsed the 
revolutionary implications that this would have for mechanics, and Whittaker 28 
gives the credit for special relativity to Poincare and Lorentz. Without entering 
this controversy, 29 it is safe to say that a comprehensive solution to the problems
of relativity in electrodynamics and mechanics was first set out in detail in 1905
by Albert Einstein 30 (1879-1955).

Figure 1.3 Founders of the Special Theory of Relativity, at the First Solvay
Conference in 1911. (Photograph: Conseil de Physique Solvay, Bruxelles 1911;
Photo Couprie, Bruxelles. Names in the legend, row by row: Goldschmidt, Planck,
Rubens, Lindemann, Hasenöhrl; Nernst, Brillouin, Sommerfeld, de Broglie, Hostelet;
Solvay, Knudsen, Herzen, Jeans, Rutherford; Lorentz, Warburg, Wien, Einstein,
Langevin; Perrin, Madame Curie, Poincaré, Kamerlingh Onnes.)

Einstein proposed that the Galilean transformation (1.3.2) should be 
replaced with a different 10-parameter space-time transformation, called a Lorentz 
transformation, that does leave Maxwell’s equations and the speed of light in- 
variant. (It is not clear that Einstein was directly influenced by the Michelson- 
Morley experiment itself, 31 but he specifically refers to “the unsuccessful attempts 
to discover any motion of the earth relative to the ‘light medium’” in his 1905
paper. 32 ) The equations of Newtonian mechanics, such as Eq. (1.3.1), are not 
invariant under Lorentz transformations; therefore Einstein was led to modify 
the laws of motion so that they would be Lorentz-invariant. The new physics, 
consisting of Maxwell’s electrodynamics and Einstein’s mechanics, then satisfied a 
new principle of relativity, the Principle of Special Relativity, which says that all 
physical equations must be invariant under Lorentz transformations. These 
developments are discussed in detail in the next chapter. 

The Lorentz group of transformations is not in any way larger than the 
Galileo group, and therefore the principle of relativity was not originated by the 
special theory of relativity, but rather restored by it. Before Maxwell, it might have 
been supposed that all of physics is invariant under the Galileo group. Maxwell’s 
equations were not invariant under this group, and for half a century it appeared 
that only mechanics, not electrodynamics, obeys the principle of relativity. After 
Einstein, it was clear that the equations of both mechanics and electrodynamics 
are invariant, but with respect to Lorentz transformations, not Galileo trans- 
formations. The laws of physics in the form given them by Maxwell and Einstein 
could still only be true in a limited class of inertial reference frames, and the 
question of what determines these inertial frames was as mysterious after 1905 
as in 1686. 

It remained to construct a relativistic theory of gravitation. A crucial step 
toward this goal was taken in 1907, when Einstein introduced the Principle of 
Equivalence of Gravitation and Inertia, 33 and used it to calculate the red shift of 
light in a gravitational field. As we shall see in Chapter 3, this principle determines 
the effects of gravitation on arbitrary physical systems, but it does not determine 
the field equations for gravitation itself. Einstein tried to use the equivalence 
principle in 1911 to calculate the deflection of light in the sun’s gravitational 
field, 34 but the structure of the field was not then correctly understood, and 
Einstein’s answer was one-half the “correct” general-relativistic result, derived
here in Chapter 8. A number of attempts were made in 1911-1912 by Einstein, 35
Abraham, 36 and Nordström 37 to construct relativistic field equations for a single
scalar gravitational field, but Einstein soon became dissatisfied with all such
theories, largely on aesthetic grounds. (The gravitational deflection of light by the
sun had not yet been measured.) A collaboration with the mathematician Marcel
Grossmann led Einstein by 1913 to the view 38 that the gravitational field must be
identified with the 10 components of the metric tensor of Riemannian space-time 
geometry. As discussed in Chapters 4 and 5, the Principle of Equivalence is 
incorporated into this formalism through the requirement that the physical
equations be invariant under general coordinate transformations, not just Lorentz
transformations, though I do not know to what extent this “General Principle of
Relativity” took on in Einstein’s mind a life of its own, apart from the Principle
of Equivalence. During the next two years, Einstein presented to the Prussian
Academy of Sciences a series of papers 39 in which he worked out the field equations
for the metric tensor and calculated the gravitational deflection of light and the 
precession of the perihelia of Mercury. These magnificent achievements were 
finally summarized by Einstein in his 1916 paper, 1 titled “The Foundation of the 
General Theory of Relativity.”


“There are really four
dimensions, three which we 
call the three planes of 
Space, and a fourth, Time. 
There is, however, a 
tendency to draw an unreal 
distinction between the 
former three dimensions 
and the latter, because it 
happens that our conscious- 
ness moves intermittently 
in one direction along the 
latter from the beginning 
to the end of our lives.” 
“‘That’, said a very young 
man, making spasmodic 
efforts to relight his cigar 
over the lamp; ‘that . . .
very clear indeed.’” H. G. 
Wells, The Time Machine 


2 SPECIAL 
RELATIVITY 


We now review Einstein’s Special Theory of Relativity. This chapter, while 
self-contained, is only a brief summary, and aims primarily at establishing our 
notation and collecting some formulas that will be useful later. The reader who 
needs a more extensive introduction to special relativity is advised to turn to one 
of the books listed at the end of this chapter, and then return. The reader who feels 
completely at home with the subject may find it desirable to move on immediately 
to Chapter 3. 


1 Lorentz Transformations 

The Principle of Special Relativity states that the laws of nature are invariant 
under a particular group of space-time coordinate transformations, called Lorentz 
transformations. We saw at the end of Chapter 1 that Newton’s laws of motion are 
invariant under the Galilean coordinate transformations (1.3.2), but that Maxwell’s
equations are not, and that Einstein resolved this conflict by replacing Galilean
invariance with Lorentz invariance. I shall not continue this discussion in historical 
terms, but shall simply define the Lorentz transformations, and then show how 
Lorentz invariance guides our search for the laws of nature. 

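A Lorentz transformation is a transformation from one system of space-time
coordinates $x^\alpha$ to another system $x'^\alpha$, so that

\[ x'^\alpha = \Lambda^\alpha{}_\beta\, x^\beta + a^\alpha \tag{2.1.1} \]

where $a^\alpha$ and $\Lambda^\alpha{}_\beta$ are constants, restricted by the conditions

\[ \Lambda^\alpha{}_\gamma\, \Lambda^\beta{}_\delta\, \eta_{\alpha\beta} = \eta_{\gamma\delta} \tag{2.1.2} \]

with

\[ \eta_{\alpha\beta} = \begin{cases} +1 & \alpha = \beta = 1, 2, \text{ or } 3 \\ -1 & \alpha = \beta = 0 \\ \phantom{+}0 & \alpha \neq \beta \end{cases} \tag{2.1.3} \]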



In our notation $\alpha$, $\beta$, $\gamma$, and so on, will always run over the four values 1, 2, 3, 0,
with $x^1$, $x^2$, $x^3$ the Cartesian components of the position vector $\mathbf{x}$ and $x^0$ the time $t$.
We shall use natural units in which the speed of light is unity, so all $x^\alpha$ have the
dimension of length. Any index, like $\beta$ in Eq. (2.1.1), that appears twice, once as a
subscript and once as a superscript, is understood to be summed over unless
otherwise noted; that is, Eq. (2.1.1) is an abbreviation for

\[ x'^\alpha = \Lambda^\alpha{}_0\, x^0 + \Lambda^\alpha{}_1\, x^1 + \Lambda^\alpha{}_2\, x^2 + \Lambda^\alpha{}_3\, x^3 + a^\alpha \]

The fundamental property that distinguishes the Lorentz transformations is
that they leave invariant the “proper time” $d\tau$, defined by

\[ d\tau^2 \equiv dt^2 - d\mathbf{x}^2 = -\eta_{\alpha\beta}\, dx^\alpha dx^\beta \tag{2.1.4} \]

In a new coordinate system $x'^\alpha$, the coordinate differentials are given by (2.1.1) as

\[ dx'^\alpha = \Lambda^\alpha{}_\gamma\, dx^\gamma \]

so the new proper time will be

\[ d\tau'^2 = -\eta_{\alpha\beta}\, dx'^\alpha dx'^\beta
           = -\eta_{\alpha\beta}\, \Lambda^\alpha{}_\gamma \Lambda^\beta{}_\delta\, dx^\gamma dx^\delta
           = -\eta_{\gamma\delta}\, dx^\gamma dx^\delta \]

and therefore

\[ d\tau'^2 = d\tau^2 \tag{2.1.5} \]

It is this property that accounts for the observation by Michelson and Morley that
the speed of light is the same in all inertial systems. A light wave front will have
$|d\mathbf{x}/dt|$ equal to the speed of light, which in our units is unity; hence the
propagation of light is described by the statement that

\[ d\tau = 0 \tag{2.1.6} \]

Performing a Lorentz transformation does not change $d\tau$, so $d\tau'^2 = 0$, and
therefore $|d\mathbf{x}'/dt'| = 1$; that is, the speed of light in the new coordinate system is
still unity.
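
A small numerical check may make the invariance concrete. The sketch below is an illustration under stated assumptions: it uses one particular Lorentz transformation, a boost along the 1-axis, written in the index order (1, 2, 3, 0) of this chapter, and the helper names and sample displacements are invented for the example. It confirms that $d\tau^2$ is unchanged and that a light ray still has unit speed in the new coordinates.

```python
import math

ETA_DIAGONAL = [1.0, 1.0, 1.0, -1.0]  # eta in index order (1, 2, 3, 0)

def proper_time_squared(dx):
    """d tau^2 = -eta_ab dx^a dx^b for dx = (dx1, dx2, dx3, dt), Eq. (2.1.4)."""
    return -sum(ETA_DIAGONAL[a] * dx[a] * dx[a] for a in range(4))

def boost_along_1(v):
    """Boost with speed v along the 1-axis, in units where c = 1."""
    g = 1.0 / math.sqrt(1.0 - v * v)
    return [
        [g,     0.0, 0.0, g * v],
        [0.0,   1.0, 0.0, 0.0],
        [0.0,   0.0, 1.0, 0.0],
        [g * v, 0.0, 0.0, g],
    ]

def transform(matrix, dx):
    """dx'^a = Lambda^a_c dx^c."""
    return [sum(matrix[a][c] * dx[c] for c in range(4)) for a in range(4)]

lam = boost_along_1(0.5)

# A timelike displacement: d tau^2 is the same before and after.
slow = [0.1, 0.2, 0.3, 1.0]
tau2_before = proper_time_squared(slow)
tau2_after = proper_time_squared(transform(lam, slow))

# A light ray: |dx/dt| = 1, so d tau = 0 in both frames.
light = [0.6, 0.0, 0.8, 1.0]
light_new = transform(lam, light)
speed_new = math.sqrt(sum(c * c for c in light_new[:3])) / light_new[3]
print(tau2_before, tau2_after, speed_new)
```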

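We can also show that the Lorentz transformations (2.1.1) are the only nonsingular
coordinate transformations $x \to x'$ that leave $d\tau^2$ invariant. (Nonsingular
means that $x'(x)$ and $x(x')$ are well-behaved differentiable functions, so that the
matrix $\partial x'^\alpha/\partial x^\beta$ has a well-defined inverse $\partial x^\beta/\partial x'^\alpha$.) A general coordinate
transformation $x \to x'$ will change $d\tau$ into $d\tau'$, given by

\[ d\tau'^2 = -\eta_{\alpha\beta}\, dx'^\alpha dx'^\beta
           = -\eta_{\alpha\beta} \frac{\partial x'^\alpha}{\partial x^\gamma} \frac{\partial x'^\beta}{\partial x^\delta}\, dx^\gamma dx^\delta \]

If this is equal to $d\tau^2$ for all $dx^\gamma$, we must have

\[ \eta_{\gamma\delta} = \eta_{\alpha\beta} \frac{\partial x'^\alpha}{\partial x^\gamma} \frac{\partial x'^\beta}{\partial x^\delta} \tag{2.1.7} \]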


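Differentiation with respect to $x^\varepsilon$ gives

\[ 0 = \eta_{\alpha\beta} \frac{\partial^2 x'^\alpha}{\partial x^\gamma \partial x^\varepsilon} \frac{\partial x'^\beta}{\partial x^\delta}
     + \eta_{\alpha\beta} \frac{\partial x'^\alpha}{\partial x^\gamma} \frac{\partial^2 x'^\beta}{\partial x^\delta \partial x^\varepsilon} \]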


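To solve for the second derivatives, we add to this the same equation with $\gamma$ and $\varepsilon$
interchanged, and subtract the same with $\varepsilon$ and $\delta$ interchanged; that is,

\[ 0 = \eta_{\alpha\beta} \left[
   \frac{\partial^2 x'^\alpha}{\partial x^\gamma \partial x^\varepsilon} \frac{\partial x'^\beta}{\partial x^\delta}
 + \frac{\partial^2 x'^\beta}{\partial x^\delta \partial x^\varepsilon} \frac{\partial x'^\alpha}{\partial x^\gamma}
 + \frac{\partial^2 x'^\alpha}{\partial x^\varepsilon \partial x^\gamma} \frac{\partial x'^\beta}{\partial x^\delta}
 + \frac{\partial^2 x'^\beta}{\partial x^\delta \partial x^\gamma} \frac{\partial x'^\alpha}{\partial x^\varepsilon}
 - \frac{\partial^2 x'^\alpha}{\partial x^\gamma \partial x^\delta} \frac{\partial x'^\beta}{\partial x^\varepsilon}
 - \frac{\partial^2 x'^\beta}{\partial x^\varepsilon \partial x^\delta} \frac{\partial x'^\alpha}{\partial x^\gamma}
 \right] \]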


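The last term cancels the second, the penultimate cancels the fourth (because
$\eta_{\alpha\beta} = \eta_{\beta\alpha}$), and the first equals the third, so we are left with

\[ 0 = 2\, \eta_{\alpha\beta} \frac{\partial^2 x'^\alpha}{\partial x^\gamma \partial x^\varepsilon} \frac{\partial x'^\beta}{\partial x^\delta} \]

But both $\eta_{\alpha\beta}$ and $\partial x'^\beta/\partial x^\delta$ are nonsingular matrices, so this immediately yields

\[ 0 = \frac{\partial^2 x'^\alpha}{\partial x^\gamma \partial x^\varepsilon} \tag{2.1.8} \]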


The general solution of (2.1.8) is of course just the linear function (2.1.1), and by
inserting (2.1.1) in (2.1.7) we see that $\Lambda^\alpha{}_\beta$ must be subject to the condition (2.1.2).
This proof is an elementary example of the sort of thing we do in Chapter 13, on
symmetric spaces. (Incidentally, if we had only assumed that the transformations
$x \to x'$ leave $d\tau$ invariant when $d\tau = 0$, that is, for a particle moving at the speed
of light, then we would have found that these transformations are in general 
nonlinear, and form a 15-parameter group, the conformal group, which contains 
the Lorentz transformations as a subgroup. But the statement that a free particle 
moves at constant velocity would not be an invariant statement unless the velocity 
were that of light, and since there are massive particles in the world, we must 
reject the conformal group as a possible invariance of nature.) 

The set of all Lorentz transformations of the form (2.1.1) is correctly called
the inhomogeneous Lorentz group, or the Poincaré group. The subset with $a^\alpha = 0$
is called the homogeneous Lorentz group. Both the homogeneous and the
inhomogeneous Lorentz groups have subgroups called the proper homogeneous and
inhomogeneous Lorentz groups, defined by imposing on $\Lambda^\alpha{}_\beta$ the additional
requirements

\[ \Lambda^0{}_0 \geq 1; \qquad \mathrm{Det}\, \Lambda = +1 \tag{2.1.9} \]

Note from (2.1.2) that

\[ (\Lambda^0{}_0)^2 = 1 + \sum_{i=1,2,3} (\Lambda^i{}_0)^2 \geq 1 \tag{2.1.10} \]

and

\[ (\mathrm{Det}\, \Lambda)^2 = 1 \tag{2.1.11} \]

[Equation (2.1.10) follows upon setting $\gamma = \delta = 0$ in (2.1.2). Equation (2.1.11)
is derived by writing Eq. (2.1.2) as a matrix equation $\eta = \Lambda^T \eta \Lambda$ and taking its
determinant.] It follows that any $\Lambda^\alpha{}_\beta$ that can be converted to the identity $\delta^\alpha{}_\beta$
by a continuous variation of its parameters must be a proper Lorentz transformation,
because it is impossible by a continuous change of parameters to jump from
$\Lambda^0{}_0 \leq -1$ to $\Lambda^0{}_0 \geq +1$, or from $\mathrm{Det}\, \Lambda = -1$ to $\mathrm{Det}\, \Lambda = +1$, and the
identity has $\Lambda^0{}_0 = +1$ and $\mathrm{Det}\, \Lambda = +1$. The improper Lorentz transformations
involve either space inversion ($\mathrm{Det}\, \Lambda = -1$, $\Lambda^0{}_0 \geq 1$), which is now known not to
be an exact symmetry of nature, 1 or time reversal ($\mathrm{Det}\, \Lambda = -1$, $\Lambda^0{}_0 \leq -1$),
which is strongly suspected to be not an exact symmetry of nature, 2 or their
product. We are dealing almost exclusively with proper Lorentz transformations,
and unless otherwise noted, any Lorentz transformation is assumed to satisfy
Eq. (2.1.9).

The proper homogeneous Lorentz transformations have a further subgroup, 
consisting of the rotations, for which 

\[ \Lambda^i{}_j = R_{ij}, \qquad \Lambda^i{}_0 = \Lambda^0{}_i = 0, \qquad \Lambda^0{}_0 = 1 \]

where $R_{ij}$ is a unimodular orthogonal matrix (i.e., $\mathrm{Det}\, R = 1$ and $R^T R = 1$) and
the indices $i$, $j$ run over the values 1, 2, 3. With regard to both rotations and the
space-time translations $x^\alpha \to x^\alpha + a^\alpha$, there is no difference between the Lorentz
group and the Galileo group discussed in Chapter 1. The difference arises only in
those transformations, called boosts, that change the velocity of the coordinate
frame. Suppose that one observer $O$ sees a particle at rest, and a second observer $O'$
sees it moving with velocity $\mathbf{v}$. From (2.1.1) we have


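\[ dx'^\alpha = \Lambda^\alpha{}_\beta\, dx^\beta \tag{2.1.12} \]

or, since $d\mathbf{x}$ vanishes,

\[ dx'^i = \Lambda^i{}_0\, dt \qquad (i = 1, 2, 3) \tag{2.1.13} \]

\[ dt' = \Lambda^0{}_0\, dt \tag{2.1.14} \]

Dividing $d\mathbf{x}'$ by $dt'$ gives the velocity $\mathbf{v}$, so

\[ \Lambda^i{}_0 = v_i\, \Lambda^0{}_0 \tag{2.1.15} \]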

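We can get a second relation between $\Lambda^i{}_0$ and $\Lambda^0{}_0$ by setting $\gamma = \delta = 0$ in
Eq. (2.1.2):

\[ -1 = \Lambda^\alpha{}_0\, \Lambda^\beta{}_0\, \eta_{\alpha\beta} = \sum_{i=1,2,3} (\Lambda^i{}_0)^2 - (\Lambda^0{}_0)^2 \tag{2.1.16} \]

The solution of Eqs. (2.1.15) and (2.1.16) is

\[ \Lambda^0{}_0 = \gamma \tag{2.1.17} \]

\[ \Lambda^i{}_0 = \gamma v_i \tag{2.1.18} \]

where

\[ \gamma \equiv (1 - \mathbf{v}^2)^{-1/2} \tag{2.1.19} \]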

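The other $\Lambda^\alpha{}_\beta$ are not uniquely determined, because if $\Lambda^\alpha{}_\beta$ carries a particle from
rest to velocity $\mathbf{v}$, then so does $\Lambda^\alpha{}_\gamma R^\gamma{}_\beta$, where $R$ is an arbitrary rotation. One
convenient choice that satisfies Eq. (2.1.2) is

\[ \Lambda^i{}_j = \delta_{ij} + v_i v_j\, \frac{\gamma - 1}{\mathbf{v}^2} \tag{2.1.20} \]

\[ \Lambda^0{}_j = \gamma v_j \tag{2.1.21} \]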

It can easily be seen that any proper homogeneous Lorentz transformation may be 
expressed as the product of a boost $\Lambda(\mathbf{v})$ times a rotation $R$.
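
The boost defined by (2.1.17)-(2.1.21) can be assembled for an arbitrary velocity and tested against the defining condition (2.1.2). The following is a sketch rather than a definitive implementation: the function names and the sample velocity are assumptions, and rows and columns are stored in the index order (1, 2, 3, 0) used in this chapter. It also checks that a particle at rest in the old frame moves with velocity $\mathbf{v}$ in the new one.

```python
import math

ETA_DIAGONAL = [1.0, 1.0, 1.0, -1.0]  # index order (1, 2, 3, 0), as in (2.1.3)

def boost(v):
    """Boost matrix from (2.1.17)-(2.1.21); rows and columns ordered (1, 2, 3, 0)."""
    v2 = sum(c * c for c in v)
    gamma = 1.0 / math.sqrt(1.0 - v2)
    lam = [[0.0] * 4 for _ in range(4)]
    for i in range(3):
        for j in range(3):
            delta = 1.0 if i == j else 0.0
            extra = v[i] * v[j] * (gamma - 1.0) / v2 if v2 > 0.0 else 0.0
            lam[i][j] = delta + extra  # Lambda^i_j, Eq. (2.1.20)
        lam[i][3] = gamma * v[i]       # Lambda^i_0, Eq. (2.1.18)
        lam[3][i] = gamma * v[i]       # Lambda^0_i, Eq. (2.1.21)
    lam[3][3] = gamma                  # Lambda^0_0, Eq. (2.1.17)
    return lam

def lorentz_condition_error(lam):
    """Largest violation of Lambda^a_c Lambda^b_d eta_ab = eta_cd, Eq. (2.1.2)."""
    worst = 0.0
    for c in range(4):
        for d in range(4):
            total = sum(ETA_DIAGONAL[a] * lam[a][c] * lam[a][d] for a in range(4))
            target = ETA_DIAGONAL[c] if c == d else 0.0
            worst = max(worst, abs(total - target))
    return worst

velocity = [0.3, -0.4, 0.5]  # arbitrary sample velocity with |v| < 1
lam = boost(velocity)
error = lorentz_condition_error(lam)

# A particle at rest has dx = (0, 0, 0, dt); after the boost it moves with velocity v.
dt = 1.0
dx_new = [lam[a][3] * dt for a in range(4)]
recovered_velocity = [dx_new[i] / dx_new[3] for i in range(3)]
print(error, recovered_velocity)
```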


2 Time Dilation 

Although the Lorentz transformations were invented to account for the 
invariance of the speed of light, the change from Galilean relativity to special 
relativity had immediate kinematic consequences for material objects moving at 
speeds less than that of light. The simplest and most important is the time dilation 
of moving clocks. An observer looking at a clock at rest will see two ticks separated
by a space-time interval $d\mathbf{x} = 0$, $dt = \Delta t$, where $\Delta t$ is the nominal period between
ticks intended by the manufacturer. He will calculate the proper time interval
(2.1.4) as

\[ d\tau = (dt^2 - d\mathbf{x}^2)^{1/2} = \Delta t \]

A second observer, who sees the same clock moving with velocity $\mathbf{v}$, will observe
that the two ticks are separated by a time interval $dt'$ and also by a space interval
$d\mathbf{x}' = \mathbf{v}\, dt'$, and he will conclude that the proper time interval is

\[ d\tau' = (dt'^2 - d\mathbf{x}'^2)^{1/2} = (1 - \mathbf{v}^2)^{1/2}\, dt' \]

But both observers are supposed to be using inertial coordinate systems, so their
coordinate systems are related by a Lorentz transformation, and on comparing
notes they must find that $d\tau = d\tau'$, in accordance with Eq. (2.1.5). It follows that
the observer who sees the clock in motion will see it tick with a period

\[ dt' = \Delta t\, (1 - \mathbf{v}^2)^{-1/2} \tag{2.2.1} \]

[For an alternate derivation, use Eqs. (2.1.14), (2.1.17), (2.1.19).] This relation is 
literally being verified every day by experiments that measure the mean lifetime of 
rapidly moving unstable particles from cosmic rays and accelerators. Such particles 
of course do not tick; instead (2.2.1) tells us here that a moving particle will have 
a mean life larger than it has at rest by a factor (1 — v 2 ) _1/2 , in perfect agreement 
with the lifetime measurements made electronically or by measuring the free path 
length. 
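As a rough numerical illustration of Eq. (2.2.1), the following sketch computes the dilated mean life and the mean decay length of a fast unstable particle. The function names, the speed v = 0.995, and the rounded muon rest lifetime used as input are assumptions introduced here for illustration; they are not taken from the text.

```python
import math

C_M_PER_S = 2.998e8               # speed of light in m/s (rounded)
MUON_REST_LIFETIME_S = 2.197e-6   # rounded muon mean life at rest; illustrative input


def lorentz_factor(v):
    """Return (1 - v^2)^(-1/2) for a speed v given in units of c."""
    if not 0.0 <= v < 1.0:
        raise ValueError("v must satisfy 0 <= v < 1 in units of c")
    return 1.0 / math.sqrt(1.0 - v * v)


def dilated_period(rest_period, v):
    """Eq. (2.2.1): period (or mean life) of a clock seen moving at speed v."""
    return rest_period * lorentz_factor(v)


def mean_decay_length_m(rest_lifetime_s, v):
    """Mean free path before decay in metres: speed times dilated mean life."""
    return v * C_M_PER_S * dilated_period(rest_lifetime_s, v)


if __name__ == "__main__":
    v = 0.995
    print("Lorentz factor        :", lorentz_factor(v))
    print("dilated mean life (s) :", dilated_period(MUON_REST_LIFETIME_S, v))
    print("mean decay length (m) :", mean_decay_length_m(MUON_REST_LIFETIME_S, v))
```

Without the factor $(1 - \mathbf{v}^2)^{-1/2}$ the decay length could never exceed the speed of light times the rest lifetime; with it, the free path grows without bound as v approaches 1.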

The time dilation (2.2.1) is not to be confused with the apparent time dilation
or contraction known as the Doppler effect. If our “clock” is a moving source of
light of frequency $\nu = 1/\Delta t$, then the time between emission of successive wave
fronts (say, with a maximum value of some component of the electric field) is
given by (2.2.1) as $dt' = \Delta t\,(1 - \mathbf{v}^2)^{-1/2}$. However, during this time the distance
from the observer to the light source will have increased by an amount $v_r\,dt'$,
where $v_r$ is the component of v along the direction from observer to light source.
Hence the period between reception of wave fronts will be

$$dt_0 = (1 + v_r)\,dt' = (1 + v_r)(1 - \mathbf{v}^2)^{-1/2}\,\Delta t$$

That is, the ratio of the frequency of the light actually measured by the observer
to the frequency of the light source at rest is

$$\frac{\nu'}{\nu} = (1 + v_r)^{-1}(1 - \mathbf{v}^2)^{1/2} \tag{2.2.2}$$

If the light source is moving away, then $v_r > 0$, and this is necessarily a red shift.
If the light source is moving transversely, then $v_r = 0$, and we have the pure time
dilation red shift discussed above. If the light source is moving directly toward the
observer, then $v_r = -v$, and (2.2.2) gives a violet shift by a factor

$$(1 + v)^{1/2}(1 - v)^{-1/2}$$

The transition from violet to red shift occurs for a source moving at an angle
between straight toward the observer and at right angles to the line of sight.
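The three cases above, and the angle at which the shift changes sign, can be checked with a short sketch of Eq. (2.2.2). The function names and the parametrization $v_r = v\cos\theta$, with $\theta$ the angle between the source velocity and the direction from observer to source, are choices made here for illustration.

```python
import math


def doppler_ratio(v, cos_theta):
    """Eq. (2.2.2): observed frequency divided by rest frequency.

    v          speed of the source in units of c (0 <= v < 1)
    cos_theta  cosine of the angle between the source velocity and the
               direction from observer to source, so v_r = v * cos_theta
    """
    v_r = v * cos_theta
    return math.sqrt(1.0 - v * v) / (1.0 + v_r)


def transition_cosine(v):
    """cos(theta) at which the ratio equals 1 (violet/red boundary); needs v > 0."""
    # Solve 1 + v * cos(theta) = (1 - v^2)^(1/2) for cos(theta).
    return (math.sqrt(1.0 - v * v) - 1.0) / v


if __name__ == "__main__":
    v = 0.5
    print("receding    :", doppler_ratio(v, +1.0))
    print("transverse  :", doppler_ratio(v, 0.0))
    print("approaching :", doppler_ratio(v, -1.0))
    print("transition angle (deg):", math.degrees(math.acos(transition_cosine(v))))
```

The transition cosine is always negative, so the boundary angle lies between head-on approach and transverse motion, as stated in the text.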


3 Particle Dynamics 

Let us suppose that a particle moves in a field of force at a velocity so high that
Newtonian mechanics does not suffice to calculate its motion. Let us also suppose,
as in the case of electrodynamics, that we know how to calculate the force F on our
particle in any Lorentz frame in which, at a given moment, it is at rest. Then we
could compute the motion of our particle by performing a Lorentz transformation
to a frame in which the particle is at rest at some time $t_0$, computing the velocity
$d\mathbf{v} = \mathbf{F}\,dt/m$ at the time $t_0 + dt$, performing another Lorentz transformation to
bring the velocity to zero again, and so on. Fortunately, there is an easier way.
Let us define the relativistic force $f^\alpha$ acting on a particle with coordinates $x^\alpha(\tau)$
by

$$f^\alpha = m\,\frac{d^2 x^\alpha}{d\tau^2} \tag{2.3.1}$$

Clearly, if $f^\alpha$ were known, we could compute the motion of our particle. We shall
relate $f^\alpha$ to the Newtonian force by noting two of its properties:

(A) If the particle is momentarily at rest, then the proper time interval $d\tau$
equals $dt$, so $f^\alpha = F^\alpha$, where $F^i$ are the Cartesian components of the nonrelativistic
force F, and

$$F^0 = 0 \tag{2.3.2}$$

(B) Under a general Lorentz transformation (2.1.1), the coordinate differentials
transform according to $dx'^\alpha = \Lambda^\alpha{}_\beta\,dx^\beta$, while $d\tau$ is invariant, so (2.3.1) tells us
that $f^\alpha$ has the Lorentz transformation rule:

$$f'^\alpha = \Lambda^\alpha{}_\beta\,f^\beta \tag{2.3.3}$$

Any quantity such as $dx^\alpha$ or $f^\alpha$ that transforms according to Eq. (2.3.3) is called a
four-vector.

Now suppose that our particle has velocity v at some moment $t_0$, and introduce
a new coordinate system $x'^\alpha$, defined by

$$x^\alpha = \Lambda^\alpha{}_\beta(\mathbf{v})\,x'^\beta$$

where $\Lambda(\mathbf{v})$ is the “boost” defined by Eqs. (2.1.17)-(2.1.21). Since $\Lambda(\mathbf{v})$ is
constructed so as to carry a particle from rest to velocity v, and since our particle has
velocity v at time $t_0$ in the coordinate system $x^\alpha$, it must be at rest at this moment
in the coordinate system $x'^\alpha$. Hence, according to (A), the force four-vector in the
coordinate system $x'^\alpha$ at time $t_0$ is equal to the nonrelativistic force $F^\alpha$. And
therefore, according to (B), the force in our original coordinate system is

$$f^\alpha = \Lambda^\alpha{}_\beta(\mathbf{v})\,F^\beta \tag{2.3.4}$$

or more explicitly, since $F^0 = 0$,

$$\mathbf{f} = \mathbf{F} + (\gamma - 1)\,\mathbf{v}\,\frac{\mathbf{v}\cdot\mathbf{F}}{\mathbf{v}^2} \tag{2.3.5}$$

$$f^0 = \gamma\,\mathbf{v}\cdot\mathbf{F} = \mathbf{v}\cdot\mathbf{f} \tag{2.3.6}$$

with v the instantaneous velocity.
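A small numerical sketch can confirm that the explicit formulas (2.3.5) and (2.3.6) agree with applying the boost matrix as in (2.3.4). It assumes the boost components quoted in Eqs. (2.1.17)-(2.1.21), units with c = 1, and index 0 for time; the function names are illustrative, not part of the text.

```python
import math


def boost(v):
    """4x4 boost matrix Lambda(v) of Eqs. (2.1.17)-(2.1.21); index 0 is time."""
    v2 = sum(c * c for c in v)
    gamma = 1.0 / math.sqrt(1.0 - v2)
    lam = [[0.0] * 4 for _ in range(4)]
    lam[0][0] = gamma
    for i in range(3):
        lam[i + 1][0] = lam[0][i + 1] = gamma * v[i]
        for j in range(3):
            lam[i + 1][j + 1] = 1.0 if i == j else 0.0
            if v2 > 0.0:
                lam[i + 1][j + 1] += v[i] * v[j] * (gamma - 1.0) / v2
    return lam


def relativistic_force(F, v):
    """Eqs. (2.3.5)-(2.3.6): [f0, f1, f2, f3] from the rest-frame force F."""
    v2 = sum(c * c for c in v)
    gamma = 1.0 / math.sqrt(1.0 - v2)
    v_dot_F = sum(a * b for a, b in zip(v, F))
    if v2 > 0.0:
        f = [F[i] + (gamma - 1.0) * v[i] * v_dot_F / v2 for i in range(3)]
    else:
        f = list(F)
    return [gamma * v_dot_F] + f


def relativistic_force_via_boost(F, v):
    """Eq. (2.3.4): apply Lambda(v) to the rest-frame four-vector (0, F)."""
    lam = boost(v)
    F4 = [0.0] + list(F)
    return [sum(lam[a][b] * F4[b] for b in range(4)) for a in range(4)]


if __name__ == "__main__":
    F, v = [1.0, 2.0, 0.0], [0.6, 0.0, 0.0]
    print(relativistic_force(F, v))
    print(relativistic_force_via_boost(F, v))
```

Both routes give the same four components, and the time component equals $\mathbf{v}\cdot\mathbf{f}$.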

Now that we know how to calculate $f^\alpha$, we can use the differential equations
(2.3.1) to calculate the four dependent variables $x^\alpha(\tau)$, and then eliminate $\tau$ to
determine x(t). However, the initial values of $dx^\alpha/d\tau$ must be chosen so that $d\tau$
really is the proper time, that is, so that

$$-1 = \eta_{\alpha\beta}\,\frac{dx^\alpha}{d\tau}\,\frac{dx^\beta}{d\tau} \tag{2.3.7}$$

Note that (2.3.7) will be true for all $\tau$ if it is true at some initial $\tau$, providing that its
derivative vanishes, that is, providing that

$$0 = 2\,\eta_{\alpha\beta}\,f^\alpha\,\frac{dx^\beta}{d\tau} \tag{2.3.8}$$

That this is true can be seen either directly from (2.3.4), or more elegantly by
noticing that the right-hand side is Lorentz-invariant:

$$\eta_{\alpha\beta}\,f'^\alpha\,\frac{dx'^\beta}{d\tau} = \eta_{\alpha\beta}\,\Lambda^\alpha{}_\gamma\,\Lambda^\beta{}_\delta\,f^\gamma\,\frac{dx^\delta}{d\tau} = \eta_{\gamma\delta}\,f^\gamma\,\frac{dx^\delta}{d\tau}$$

and that it vanishes by virtue of (2.3.2) in a reference frame in which the particle is
at rest.


4 Energy and Momentum 


The relativistic form (2.3.1) of Newton’s second law immediately suggests that
we define an energy-momentum four-vector

$$p^\alpha \equiv m\,\frac{dx^\alpha}{d\tau} \tag{2.4.1}$$

and write the second law as

$$\frac{dp^\alpha}{d\tau} = f^\alpha \tag{2.4.2}$$

Recall that

$$d\tau \equiv (dt^2 - d\mathbf{x}^2)^{1/2} = (1 - \mathbf{v}^2)^{1/2}\,dt$$

where

$$\mathbf{v} \equiv \frac{d\mathbf{x}}{dt}$$

Then the space components of $p^\alpha$ form the momentum vector

$$\mathbf{p} = m\gamma\mathbf{v} \tag{2.4.3}$$

and its time component is the energy

$$p^0 \equiv E = m\gamma \tag{2.4.4}$$

where

$$\gamma \equiv \frac{dt}{d\tau} = (1 - \mathbf{v}^2)^{-1/2} \tag{2.4.5}$$

For small v, these definitions give

$$\mathbf{p} = m\mathbf{v} + O(v^3) \tag{2.4.6}$$

$$E = m + \tfrac{1}{2}m\mathbf{v}^2 + O(v^4) \tag{2.4.7}$$

in agreement with the nonrelativistic formulas, except for the term m in E. (Recall
that in our units 1 sec equals $3 \times 10^{10}$ cm, so 1 g equals $9 \times 10^{20}$ ergs.) Sometimes
the factor $m\gamma$ is called the relativistic mass $\tilde{m}$, so that $\mathbf{p} = \tilde{m}\mathbf{v}$. I do not follow this
custom here; for us, “mass” will always mean the constant m.

Why do we call p and E the relativistic momentum and energy? We can use
these names for anything we like, but if the concepts of momentum and energy
are to be useful they must be reserved for quantities that are conserved. The
unique feature of our p and E is that, if one observer says that they are conserved
in a reaction, then so will any other observer related to the first by a Lorentz
transformation. Note that $dx^\alpha$ is a four-vector whereas m and $d\tau$ are invariants, so
the $p^\alpha$ for any single particle is a four-vector; that is, it transforms under (2.1.1)
like

$$p'^\alpha = \Lambda^\alpha{}_\beta\,p^\beta$$

Since $\Lambda$ does not depend on anything but the Lorentz transformation being
performed, it follows that in any reaction, the change of the sum of the $p^\alpha$ of all
particles is also a four-vector:

$$\Delta \sum_n p_n'^\alpha = \Lambda^\alpha{}_\beta\,\Delta \sum_n p_n^\beta$$

(The sums run over all particles, and $\Delta$ denotes the difference between initial and
final states.) The conservation of p and E in the original inertial frame tells us that
$\Delta \sum_n p_n^\alpha$ vanishes, so in any coordinate system related to the first by a Lorentz
transformation they will still be conserved; that is, $\Delta \sum_n p_n'^\alpha$ will vanish.

(I shall not show here that p and E are the only functions of velocity whose
conservation is Lorentz-invariant.³ However, it is worth stressing that E must be
conserved if p is. For suppose that momentum is conserved in two different
coordinate systems related by a Lorentz transformation, that is,

$$\Delta \sum_n \mathbf{p}_n = 0 \qquad \Delta \sum_n \mathbf{p}'_n = 0$$

Since $\Delta \sum_n p_n^\alpha$ is a four-vector, we have

$$\Delta \sum_n p_n'^i = \Lambda^i{}_\beta\,\Delta \sum_n p_n^\beta$$

and using momentum conservation in both coordinate systems, this gives

$$0 = \Lambda^i{}_0\,\Delta \sum_n p_n^0$$

But $\Lambda^i{}_0$ is not necessarily zero, so $p^0 = E$ is conserved.)

At zero velocity the energy E has the finite value m. For this reason we
sometimes give the name “kinetic energy” to the quantity $E - m$, which for small v
is approximately $\tfrac{1}{2}m\mathbf{v}^2$. If the total mass is conserved in a reaction (as in elastic
scattering), then the kinetic energy is conserved, but if some mass is destroyed
(as in radioactive decay or fusion or fission), then very large quantities of kinetic
energy will be liberated, with consequences of well-known importance.

The velocity can be eliminated from Eqs. (2.4.3) and (2.4.4), yielding a relation
between energy and momentum

$$E(\mathbf{p}) = (\mathbf{p}^2 + m^2)^{1/2} \tag{2.4.8}$$

This can also be derived by noting from (2.4.1) and the definition of $d\tau$ that

$$\eta_{\alpha\beta}\,p^\alpha p^\beta = -m^2 \tag{2.4.9}$$

For a photon or neutrino we must set $\mathbf{v}^2 = 1$ and $m = 0$, so (2.4.3) and
(2.4.4) become indeterminate, but their ratio gives a relation useful for all particles

$$\frac{\mathbf{p}}{E} = \mathbf{v} \tag{2.4.10}$$

Note that for $m = 0$ Eq. (2.4.8) gives

$$E = |\mathbf{p}|$$

so v is a unit vector, as of course it must be for a massless particle.
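The relations (2.4.3), (2.4.4), (2.4.8), and (2.4.10) can be exercised together in a few lines. This is only a sketch in units with c = 1; the function names and the sample masses and momenta are illustrative assumptions.

```python
import math


def four_momentum(m, v):
    """Eqs. (2.4.3)-(2.4.4): return (E, p) for mass m and velocity v with |v| < 1."""
    gamma = 1.0 / math.sqrt(1.0 - sum(c * c for c in v))
    return m * gamma, [m * gamma * c for c in v]


def energy_from_momentum(m, p):
    """Eq. (2.4.8): E = (p^2 + m^2)^(1/2); also valid for m = 0."""
    return math.sqrt(sum(c * c for c in p) + m * m)


def velocity_from_momentum(m, p):
    """Eq. (2.4.10): v = p / E, usable for massive and massless particles alike."""
    E = energy_from_momentum(m, p)
    return [c / E for c in p]


if __name__ == "__main__":
    E, p = four_momentum(2.0, [0.6, 0.0, 0.0])
    print("massive :", E, p, velocity_from_momentum(2.0, p))
    # A massless particle: only the momentum is specified, and |v| comes out as 1.
    print("massless:", velocity_from_momentum(0.0, [3.0, 4.0, 0.0]))
```

Working from the momentum rather than the velocity avoids the indeterminate product $m\gamma$ in the massless limit.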
5 Vectors and Tensors

Next we go on to electrodynamics and relativistic hydrodynamics, but it is 
convenient first to pause and outline a notation that makes the Lorentz transforma- 
tion properties of physical quantities transparent. This notation will be extended in 
Chapter 4, on tensor analysis, to encompass general coordinate transformations, 
but in fact few changes will be needed. 

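We have already introduced the term “four-vector” for any quantity such as
$dx^\alpha$ or $f^\alpha$ or $p^\alpha$ that undergoes the transformation

$$V^\alpha \to V'^\alpha = \Lambda^\alpha{}_\beta\,V^\beta \tag{2.5.1}$$

when the coordinate system is transformed by

$$x^\alpha \to x'^\alpha = \Lambda^\alpha{}_\beta\,x^\beta \tag{2.5.2}$$

More precisely, such a $V^\alpha$ should be called a contravariant four-vector, to
distinguish it from a covariant four-vector, defined as a quantity $U_\alpha$ whose
transformation rule is

$$U_\alpha \to U'_\alpha = \Lambda_\alpha{}^\beta\,U_\beta \tag{2.5.3}$$

where

$$\Lambda_\alpha{}^\beta \equiv \eta_{\alpha\gamma}\,\eta^{\beta\delta}\,\Lambda^\gamma{}_\delta \tag{2.5.4}$$

The matrix $\eta^{\beta\delta}$ introduced here is numerically the same as $\eta_{\beta\delta}$, that is,

$$\eta^{\beta\delta} = \eta_{\beta\delta} \tag{2.5.5}$$

but we write it with indices upstairs to conform with our summation convention.
Note that

$$\eta^{\alpha\gamma}\eta_{\gamma\beta} = \delta^\alpha{}_\beta \equiv \begin{cases} +1 & \alpha = \beta \\ 0 & \alpha \neq \beta \end{cases} \tag{2.5.6}$$

so $\Lambda_\beta{}^\alpha$ is the inverse of the matrix $\Lambda^\beta{}_\alpha$, that is,

$$\Lambda_\alpha{}^\gamma\,\Lambda^\alpha{}_\beta = \eta_{\alpha\delta}\,\eta^{\gamma\varepsilon}\,\Lambda^\delta{}_\varepsilon\,\Lambda^\alpha{}_\beta = \eta_{\varepsilon\beta}\,\eta^{\gamma\varepsilon} = \delta^\gamma{}_\beta \tag{2.5.7}$$

It follows that the scalar product of a contravariant with a covariant four-vector
is invariant, that is,

$$U'_\alpha V'^\alpha = \Lambda_\alpha{}^\gamma\,\Lambda^\alpha{}_\beta\,U_\gamma V^\beta = U_\beta V^\beta \tag{2.5.8}$$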
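The inverse property (2.5.7) and the invariance (2.5.8) are easy to check numerically for a boost. The sketch below assumes the metric with $\eta_{00} = -1$ and $\eta_{ii} = +1$ (consistent with Eq. (2.3.7)), index 0 for time, and the boost components of Eqs. (2.1.17)-(2.1.21); all function names are illustrative.

```python
import math

ETA = [[-1.0, 0.0, 0.0, 0.0],
       [0.0, 1.0, 0.0, 0.0],
       [0.0, 0.0, 1.0, 0.0],
       [0.0, 0.0, 0.0, 1.0]]


def boost(v):
    """4x4 boost matrix Lambda^alpha_beta, stored as lam[alpha][beta]."""
    v2 = sum(c * c for c in v)
    gamma = 1.0 / math.sqrt(1.0 - v2)
    lam = [[0.0] * 4 for _ in range(4)]
    lam[0][0] = gamma
    for i in range(3):
        lam[i + 1][0] = lam[0][i + 1] = gamma * v[i]
        for j in range(3):
            lam[i + 1][j + 1] = 1.0 if i == j else 0.0
            if v2 > 0.0:
                lam[i + 1][j + 1] += v[i] * v[j] * (gamma - 1.0) / v2
    return lam


def covariant_lambda(lam):
    """Eq. (2.5.4): Lambda_alpha^beta, stored as low[alpha][beta]."""
    return [[sum(ETA[a][g] * ETA[b][d] * lam[g][d]
                 for g in range(4) for d in range(4))
             for b in range(4)] for a in range(4)]


def transform(matrix, vector):
    """Apply a 4x4 transformation matrix to a four-component object."""
    return [sum(matrix[a][b] * vector[b] for b in range(4)) for a in range(4)]


def contract(U_lower, V_upper):
    """Scalar product U_alpha V^alpha of a covariant with a contravariant vector."""
    return sum(u * w for u, w in zip(U_lower, V_upper))


if __name__ == "__main__":
    lam = boost([0.3, -0.4, 0.5])
    low = covariant_lambda(lam)
    V = [1.0, 2.0, -3.0, 0.5]    # contravariant components V^alpha
    U = [0.7, -1.1, 0.2, 4.0]    # covariant components U_alpha
    print(contract(U, V), contract(transform(low, U), transform(lam, V)))
```

The two printed numbers agree up to rounding, which is the content of (2.5.8).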

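To every contravariant four-vector $V^\alpha$ there corresponds a covariant four-vector

$$V_\alpha \equiv \eta_{\alpha\beta}\,V^\beta \tag{2.5.9}$$

and to every covariant $U_\alpha$ there corresponds a contravariant

$$U^\alpha \equiv \eta^{\alpha\beta}\,U_\beta \tag{2.5.10}$$

Note that raising the index on $V_\alpha$ simply gives back $V^\alpha$, and lowering the index on
$U^\alpha$ simply gives back $U_\alpha$:

$$\eta^{\alpha\beta}V_\beta = \eta^{\alpha\beta}\eta_{\beta\gamma}V^\gamma = V^\alpha$$

$$\eta_{\alpha\beta}U^\beta = \eta_{\alpha\beta}\eta^{\beta\gamma}U_\gamma = U_\alpha$$

Note also that (2.5.9) does yield a covariant, because

$$V'_\alpha = \eta_{\alpha\beta}V'^\beta = \eta_{\alpha\beta}\,\Lambda^\beta{}_\gamma\,V^\gamma = \eta_{\alpha\beta}\,\eta^{\gamma\delta}\,\Lambda^\beta{}_\gamma\,V_\delta = \Lambda_\alpha{}^\delta\,V_\delta$$

in agreement with (2.5.3). Similarly, (2.5.10) does yield a contravariant.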

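Although any vector can be written in a contravariant or a covariant form,
there are some vectors, such as $dx^\alpha$, that appear more naturally contravariant and
others that appear more naturally covariant. An example of the latter is the
gradient $\partial/\partial x^\alpha$, which obeys the transformation rule

$$\frac{\partial}{\partial x'^\alpha} = \frac{\partial x^\beta}{\partial x'^\alpha}\,\frac{\partial}{\partial x^\beta}$$

Multiplying (2.5.2) by $\Lambda_\alpha{}^\gamma$ gives

$$x^\gamma = \Lambda_\alpha{}^\gamma\,x'^\alpha$$

so

$$\frac{\partial x^\beta}{\partial x'^\alpha} = \Lambda_\alpha{}^\beta$$

and therefore the gradient is covariant:

$$\frac{\partial}{\partial x'^\alpha} = \Lambda_\alpha{}^\beta\,\frac{\partial}{\partial x^\beta} \tag{2.5.11}$$

One consequence is that the divergence of a contravariant vector $\partial V^\alpha/\partial x^\alpha$ is
invariant. Another is that the scalar product of $\partial/\partial x^\alpha$ with itself, the d’Alembertian
operator

$$\Box^2 \equiv \eta^{\alpha\beta}\,\frac{\partial}{\partial x^\alpha}\,\frac{\partial}{\partial x^\beta} \tag{2.5.12}$$

is also invariant.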

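Many physical quantities are not scalars or vectors, but more complicated
objects called tensors. A tensor has several contravariant and/or covariant indices
with corresponding Lorentz transformation properties, for example,

$$T^\gamma{}_{\alpha\beta} \to T'^\gamma{}_{\alpha\beta} = \Lambda^\gamma{}_\delta\,\Lambda_\alpha{}^\varepsilon\,\Lambda_\beta{}^\zeta\,T^\delta{}_{\varepsilon\zeta}$$

A contravariant or covariant vector can be regarded as a tensor with one index,
and a scalar is a tensor with no indices. There are several ways of forming tensors
out of other tensors:

(A) Linear Combinations. A linear combination of tensors with the same
upper and lower indices is a tensor with these indices. For instance, if $R^\alpha{}_\beta$ and
$S^\alpha{}_\beta$ are tensors, and a and b are scalars, and we define

$$T^\alpha{}_\beta \equiv a R^\alpha{}_\beta + b S^\alpha{}_\beta$$

then $T^\alpha{}_\beta$ is a tensor, that is,

$$T'^\alpha{}_\beta \equiv a R'^\alpha{}_\beta + b S'^\alpha{}_\beta = a\,\Lambda^\alpha{}_\gamma\,\Lambda_\beta{}^\delta\,R^\gamma{}_\delta + b\,\Lambda^\alpha{}_\gamma\,\Lambda_\beta{}^\delta\,S^\gamma{}_\delta = \Lambda^\alpha{}_\gamma\,\Lambda_\beta{}^\delta\,T^\gamma{}_\delta$$

(B) Direct Products. The product of the components of two tensors yields a
tensor whose upper and lower indices consist of all the upper and lower indices of
the two original tensors. For instance, if $A^\alpha{}_\beta$ and $B^\gamma$ are tensors, and

$$T^\alpha{}_\beta{}^\gamma \equiv A^\alpha{}_\beta\,B^\gamma$$

then $T^\alpha{}_\beta{}^\gamma$ is a tensor, that is,

$$T'^\alpha{}_\beta{}^\gamma \equiv A'^\alpha{}_\beta\,B'^\gamma = \Lambda^\alpha{}_\delta\,\Lambda_\beta{}^\varepsilon\,\Lambda^\gamma{}_\zeta\,T^\delta{}_\varepsilon{}^\zeta$$

(C) Contraction. Setting an upper and lower index equal and summing it
over its values 0, 1, 2, 3, yields a tensor with these two indices absent. For instance,
if $T^\alpha{}_\beta{}^{\gamma\delta}$ is a tensor and

$$T^{\alpha\gamma} \equiv T^\alpha{}_\beta{}^{\gamma\beta}$$

then $T^{\alpha\gamma}$ is a tensor, that is,

$$T'^{\alpha\gamma} \equiv T'^\alpha{}_\beta{}^{\gamma\beta} = \Lambda^\alpha{}_\delta\,\Lambda_\beta{}^\varepsilon\,\Lambda^\gamma{}_\zeta\,\Lambda^\beta{}_\kappa\,T^\delta{}_\varepsilon{}^{\zeta\kappa} = \Lambda^\alpha{}_\delta\,\Lambda^\gamma{}_\zeta\,\delta^\varepsilon{}_\kappa\,T^\delta{}_\varepsilon{}^{\zeta\kappa} = \Lambda^\alpha{}_\delta\,\Lambda^\gamma{}_\zeta\,T^{\delta\zeta}$$

(D) Differentiation. The derivative $\partial/\partial x^\alpha$ of any tensor is a tensor with one
additional lower index $\alpha$. For instance, if $T^{\beta\gamma}$ is a tensor and

$$T_\alpha{}^{\beta\gamma} \equiv \frac{\partial}{\partial x^\alpha}\,T^{\beta\gamma}$$

then $T_\alpha{}^{\beta\gamma}$ is a tensor, that is,

$$T'_\alpha{}^{\beta\gamma} \equiv \frac{\partial}{\partial x'^\alpha}\,T'^{\beta\gamma} = \Lambda_\alpha{}^\delta\,\frac{\partial}{\partial x^\delta}\left(\Lambda^\beta{}_\varepsilon\,\Lambda^\gamma{}_\zeta\,T^{\varepsilon\zeta}\right) = \Lambda_\alpha{}^\delta\,\Lambda^\beta{}_\varepsilon\,\Lambda^\gamma{}_\zeta\,T_\delta{}^{\varepsilon\zeta}$$

Note that the order of indices matters, even as between upper and lower indices.
For instance, $T_\alpha{}^{\beta\gamma}$ may or may not be the same as $T^\beta{}_\alpha{}^\gamma$.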

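Aside from the scalars, there are three special tensors whose components are
the same in all coordinate systems:

(i) The Minkowski Tensor. The definition of Lorentz transformations tells
us immediately that $\eta_{\alpha\beta}$ is a covariant tensor,

$$\eta_{\alpha\beta} = \Lambda_\alpha{}^\gamma\,\Lambda_\beta{}^\delta\,\eta_{\gamma\delta}$$

Multiplying this equation by $\eta^{\alpha\varepsilon}\eta^{\beta\zeta}$, and using (2.5.6) and (2.5.4), we find that

$$\eta^{\varepsilon\zeta} = \eta^{\alpha\varepsilon}\,\eta^{\beta\zeta}\,\Lambda_\alpha{}^\gamma\,\Lambda_\beta{}^\delta\,\eta_{\gamma\delta} = \Lambda^\varepsilon{}_\gamma\,\Lambda^\zeta{}_\delta\,\eta^{\gamma\delta}$$

so $\eta^{\alpha\beta}$ is a contravariant tensor. (Recall that $\eta^{\alpha\beta}$ and $\eta_{\alpha\beta}$ are numerically the same
matrix, so this is a matrix that is both covariant and contravariant.) We can form a
mixed tensor by lowering one index on $\eta^{\alpha\beta}$ or raising one index on $\eta_{\alpha\beta}$; this gives
the Kronecker symbol

$$\delta^\alpha{}_\beta \equiv \eta^{\alpha\gamma}\,\eta_{\gamma\beta}$$

That this is a tensor follows from rules (B) and (C) and the fact that $\eta^{\alpha\gamma}$ and $\eta_{\gamma\beta}$
are tensors.

(ii) The Levi-Civita Tensor. This is a quantity $\epsilon^{\alpha\beta\gamma\delta}$ defined by

$$\epsilon^{\alpha\beta\gamma\delta} \equiv \begin{cases} +1 & \text{if } \alpha\beta\gamma\delta \text{ even permutation of } 0123 \\ -1 & \text{if } \alpha\beta\gamma\delta \text{ odd permutation of } 0123 \\ \phantom{+}0 & \text{otherwise} \end{cases} \tag{2.5.13}$$

Note that

$$\Lambda^\alpha{}_\varepsilon\,\Lambda^\beta{}_\zeta\,\Lambda^\gamma{}_\kappa\,\Lambda^\delta{}_\lambda\,\epsilon^{\varepsilon\zeta\kappa\lambda} \propto \epsilon^{\alpha\beta\gamma\delta}$$

because the left-hand side must be odd under any single permutation of the indices
$\alpha\beta\gamma\delta$. To find the constant of proportionality, set $\alpha\beta\gamma\delta = 0123$. The left-hand side
is then simply the determinant of $\Lambda$, which for proper Lorentz transformations is
unity. (See Section 2.1.) Thus the constant of proportionality is unity, that is,

$$\Lambda^\alpha{}_\varepsilon\,\Lambda^\beta{}_\zeta\,\Lambda^\gamma{}_\kappa\,\Lambda^\delta{}_\lambda\,\epsilon^{\varepsilon\zeta\kappa\lambda} = \epsilon^{\alpha\beta\gamma\delta} \tag{2.5.14}$$

and therefore $\epsilon^{\alpha\beta\gamma\delta}$ is a tensor.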
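Equation (2.5.14) can be verified by brute force for a particular boost. The following sketch builds $\epsilon^{\alpha\beta\gamma\delta}$ from permutation parity and evaluates the left-hand side for all 256 index combinations; the helper names and the sample velocity are assumptions made for illustration, and the boost components are those of Eqs. (2.1.17)-(2.1.21).

```python
import itertools
import math


def boost(v):
    """4x4 boost matrix Lambda^alpha_beta, stored as lam[alpha][beta]; index 0 is time."""
    v2 = sum(c * c for c in v)
    gamma = 1.0 / math.sqrt(1.0 - v2)
    lam = [[0.0] * 4 for _ in range(4)]
    lam[0][0] = gamma
    for i in range(3):
        lam[i + 1][0] = lam[0][i + 1] = gamma * v[i]
        for j in range(3):
            lam[i + 1][j + 1] = 1.0 if i == j else 0.0
            if v2 > 0.0:
                lam[i + 1][j + 1] += v[i] * v[j] * (gamma - 1.0) / v2
    return lam


def levi_civita():
    """Eq. (2.5.13): map each permutation of (0, 1, 2, 3) to +1 or -1."""
    eps = {}
    for perm in itertools.permutations(range(4)):
        # The parity of a permutation equals the parity of its inversion count.
        inversions = sum(1 for i in range(4) for j in range(i + 1, 4)
                         if perm[i] > perm[j])
        eps[perm] = -1 if inversions % 2 else 1
    return eps


def transformed_levi_civita(lam, eps):
    """Left-hand side of Eq. (2.5.14) for every choice of the four free indices."""
    out = {}
    for a, b, c, d in itertools.product(range(4), repeat=4):
        out[(a, b, c, d)] = sum(
            sign * lam[a][e] * lam[b][z] * lam[c][k] * lam[d][l]
            for (e, z, k, l), sign in eps.items())
    return out


if __name__ == "__main__":
    eps = levi_civita()
    out = transformed_levi_civita(boost([0.3, -0.4, 0.5]), eps)
    worst = max(abs(out[idx] - eps.get(idx, 0)) for idx in out)
    print("largest deviation from Eq. (2.5.14):", worst)
```

The component with indices 0123 is the determinant of $\Lambda$, so the check also confirms that the boost has unit determinant.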

(iii) The Zero Tensor. We can define a tensor with an arbitrary pattern of 
upper and lower indices by setting all its components equal to zero. 

Since $\eta^{\alpha\beta}$ and $\eta_{\alpha\beta}$ are tensors, we can use them to raise or lower indices on an
arbitrary tensor; rules (B) and (C) tell us that this gives a new tensor with one
more upper or lower index and one less lower or upper index. For instance, if
$T^{\alpha\beta\gamma}$ is a tensor, then so is

$$T^\alpha{}_\beta{}^\gamma \equiv \eta_{\beta\delta}\,T^{\alpha\delta\gamma}$$

In particular, we can lower some or all of the indices on the Levi-Civita tensor.
Lowering all the indices gives back the same numerical quantity except for a
minus sign:

$$\epsilon_{\alpha\beta\gamma\delta} = -\epsilon^{\alpha\beta\gamma\delta} \tag{2.5.15}$$

The point of all this algebra is that it enables us to tell at a glance that an
equation is Lorentz-invariant. The fundamental theorem is that if two tensors,
with the same upper and lower indices, are equal in one coordinate system, then they
are equal in any other coordinate system related to the first by a Lorentz transformation.
For instance, if $T^\alpha{}_\beta = S^\alpha{}_\beta$, then

$$T'^\alpha{}_\beta = \Lambda^\alpha{}_\gamma\,\Lambda_\beta{}^\delta\,T^\gamma{}_\delta = \Lambda^\alpha{}_\gamma\,\Lambda_\beta{}^\delta\,S^\gamma{}_\delta = S'^\alpha{}_\beta$$

In particular, the statement that a tensor vanishes is Lorentz-invariant.

The formalism outlined in this section is nothing but a description of the 
representations of the homogeneous Lorentz group. We shall explore these 
representations in greater generality in Section 2.12. 


6 Currents and Densities 


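Suppose that we have a system of particles with positions $\mathbf{x}_n(t)$ and charges $e_n$.
The current and charge densities are usually defined by

$$\mathbf{J}(\mathbf{x}, t) \equiv \sum_n e_n\,\delta^3(\mathbf{x} - \mathbf{x}_n(t))\,\frac{d\mathbf{x}_n(t)}{dt} \tag{2.6.1}$$

$$\varepsilon(\mathbf{x}, t) \equiv \sum_n e_n\,\delta^3(\mathbf{x} - \mathbf{x}_n(t)) \tag{2.6.2}$$

Here $\delta^3$ is the Dirac delta function, defined by the statement that for any smooth
function f(x),

$$\int d^3x\,f(\mathbf{x})\,\delta^3(\mathbf{x} - \mathbf{y}) = f(\mathbf{y})$$

We can unite J and $\varepsilon$ into a four-vector $J^\alpha$ by setting

$$J^0 \equiv \varepsilon \tag{2.6.3}$$

that is

$$J^\alpha(x) \equiv \sum_n e_n\,\delta^3(\mathbf{x} - \mathbf{x}_n(t))\,\frac{dx_n^\alpha(t)}{dt} \tag{2.6.4}$$

To show that this is a four-vector, define $x_n^0(t) \equiv t$, and write (2.6.4) as

$$J^\alpha(x) = \int dt' \sum_n e_n\,\delta^4(x - x_n(t'))\,\frac{dx_n^\alpha(t')}{dt'}$$

The differentials $dt'$ cancel, and hence can be replaced with an invariant $d\tau$:

$$J^\alpha(x) = \int d\tau \sum_n e_n\,\delta^4(x - x_n(\tau))\,\frac{dx_n^\alpha(\tau)}{d\tau} \tag{2.6.5}$$

But $\delta^4(x - x_n(\tau))$ is a scalar (because Det $\Lambda$ = 1) and $dx_n^\alpha$ is a four-vector, so $J^\alpha$
is a four-vector.

We also note that

$$\begin{aligned}
\nabla\cdot\mathbf{J}(\mathbf{x}, t) &= \sum_n e_n\,\frac{\partial}{\partial x^i}\,\delta^3(\mathbf{x} - \mathbf{x}_n(t))\,\frac{dx_n^i(t)}{dt} \\
&= -\sum_n e_n\,\frac{\partial}{\partial x_n^i}\,\delta^3(\mathbf{x} - \mathbf{x}_n(t))\,\frac{dx_n^i(t)}{dt} \\
&= -\sum_n e_n\,\frac{\partial}{\partial t}\,\delta^3(\mathbf{x} - \mathbf{x}_n(t)) \\
&= -\frac{\partial}{\partial t}\,\varepsilon(\mathbf{x}, t)
\end{aligned}$$

or, in four-dimensional language

$$\frac{\partial}{\partial x^\alpha}\,J^\alpha(x) = 0 \tag{2.6.6}$$

The Lorentz invariance of this statement is evident.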

Whenever any current $J^\alpha(x)$ satisfies the invariant conservation law (2.6.6),
we can form a total charge

$$Q \equiv \int d^3x\,J^0(x) \tag{2.6.7}$$

This quantity is time-independent, because (2.6.6) and Gauss’s theorem give

$$\frac{dQ}{dt} = \int d^3x\,\frac{\partial}{\partial x^0}\,J^0(x) = -\int d^3x\,\nabla\cdot\mathbf{J}(x) = 0$$

If $J^\alpha(x)$ is a four-vector, then Q is not only constant but a scalar. To see this, write
Q as

$$Q = \int d^4x\,J^\alpha(x)\,\partial_\alpha\theta(n_\beta x^\beta) \tag{2.6.8}$$

where $\theta$ is the step function

$$\theta(s) = \begin{cases} 1 & s > 0 \\ 0 & s < 0 \end{cases}$$

and $n_\alpha$ is defined by

$$n_1 \equiv n_2 \equiv n_3 \equiv 0, \qquad n_0 \equiv +1$$

The effect of a Lorentz transformation on Q is then evidently simply to change n:

$$Q' = \int d^4x\,J^\alpha(x)\,\partial_\alpha\theta(n'_\beta x^\beta)$$

$$n'_\beta \equiv \Lambda^\gamma{}_\beta\,n_\gamma$$

and using (2.6.6), the change in Q is then

$$Q' - Q = \int d^4x\,\partial_\alpha\left[J^\alpha(x)\left\{\theta(n'_\beta x^\beta) - \theta(n_\beta x^\beta)\right\}\right]$$

The current $J^\alpha(x)$ can be presumed to vanish if $|\mathbf{x}| \to \infty$ with t fixed, whereas the
function $\theta(n'_\beta x^\beta) - \theta(n_\beta x^\beta)$ vanishes if $|t| \to \infty$ with x fixed. Hence we can apply
the four-dimensional Gauss theorem, and find $Q' - Q = 0$; that is, Q is a scalar.
(For the current density $J^0$ defined by (2.6.2) the charge (2.6.7) is

$$Q = \sum_n e_n$$

which of course is a constant scalar; however, in dealing with the charge and
current distributions of extended particles it is important to realize that (2.6.7)
defines a time-independent scalar for any conserved four-vector $J^\alpha$.)


7 Electrodynamics 

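Maxwell’s equations for the electric and magnetic fields E, B produced by a
given charge density $\varepsilon$ and current density J are

$$\nabla\cdot\mathbf{E} = \varepsilon \tag{2.7.1}$$

$$\nabla\times\mathbf{B} = \frac{\partial\mathbf{E}}{\partial t} + \mathbf{J} \tag{2.7.2}$$

$$\nabla\cdot\mathbf{B} = 0 \tag{2.7.3}$$

$$\nabla\times\mathbf{E} = -\frac{\partial\mathbf{B}}{\partial t} \tag{2.7.4}$$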





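To uncover the Lorentz transformation properties of E and B, we introduce a
matrix $F^{\alpha\beta}$, defined by

$$\begin{aligned}
F^{12} &= B_3 & F^{23} &= B_1 & F^{31} &= B_2 \\
F^{01} &= E_1 & F^{02} &= E_2 & F^{03} &= E_3 \\
F^{\alpha\beta} &= -F^{\beta\alpha}
\end{aligned} \tag{2.7.5}$$

Then (2.7.1) and (2.7.2) can be written as

$$\frac{\partial}{\partial x^\alpha}\,F^{\alpha\beta} = -J^\beta \tag{2.7.6}$$

(recall that $J^0 = \varepsilon$) whereas (2.7.3) and (2.7.4) give

$$\epsilon^{\alpha\beta\gamma\delta}\,\frac{\partial}{\partial x^\beta}\,F_{\gamma\delta} = 0 \tag{2.7.7}$$

where $\epsilon^{\alpha\beta\gamma\delta}$ is the Levi-Civita symbol defined in Section 2.5, and $F_{\gamma\delta}$ is the covariant
defined as usual by

$$F_{\gamma\delta} \equiv \eta_{\gamma\alpha}\,\eta_{\delta\beta}\,F^{\alpha\beta}$$

Since $J^\alpha$ is a four-vector, we conclude that $F^{\alpha\beta}$ is a tensor,

$$F'^{\alpha\beta} = \Lambda^\alpha{}_\gamma\,\Lambda^\beta{}_\delta\,F^{\gamma\delta} \tag{2.7.8}$$

because if $F^{\alpha\beta}$ is a solution of (2.7.6) and (2.7.7), then (2.7.8) will be a solution in a
Lorentz-transformed coordinate system.

The electromagnetic force on a charged particle is

$$f^\alpha = e\,\eta_{\beta\gamma}\,F^{\alpha\beta}\,\frac{dx^\gamma}{d\tau} = e\,F^\alpha{}_\gamma\,\frac{dx^\gamma}{d\tau} \tag{2.7.9}$$

That this is correct may be seen by repeating the arguments of Section 3. Equation
(2.7.9) is correct in a reference system in which the particle is at rest because in this
frame it gives $\mathbf{f} = e\mathbf{E}$, $f^0 = 0$, and it transforms like a four-vector, so it is correct
for all velocities. Note incidentally that (2.7.9) and (2.4.2) give

$$\frac{d\mathbf{p}}{dt} = e\,[\mathbf{E} + \mathbf{v}\times\mathbf{B}]$$

so the formula for magnetic force follows as a consequence of special relativity.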
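The step from (2.7.9) to the familiar force law can be checked numerically. The sketch below fills in $F^{\alpha\beta}$ from (2.7.5), evaluates (2.7.9) for a moving charge, and compares $\mathbf{f}/\gamma$ with $e[\mathbf{E} + \mathbf{v}\times\mathbf{B}]$. It assumes $\eta_{00} = -1$, units with c = 1, and index 0 for time; the function names and sample field values are illustrative.

```python
import math

ETA_DIAG = [-1.0, 1.0, 1.0, 1.0]   # diagonal of eta: eta_00 = -1, eta_ii = +1


def field_tensor(E, B):
    """F^{alpha beta} of Eq. (2.7.5), stored as F[alpha][beta]."""
    F = [[0.0] * 4 for _ in range(4)]
    for i in range(3):
        F[0][i + 1] = E[i]
        F[i + 1][0] = -E[i]
    F[1][2], F[2][3], F[3][1] = B[2], B[0], B[1]
    F[2][1], F[3][2], F[1][3] = -B[2], -B[0], -B[1]
    return F


def four_force(e, E, B, v):
    """Eq. (2.7.9): f^alpha = e F^{alpha beta} eta_{beta gamma} dx^gamma/dtau."""
    gamma = 1.0 / math.sqrt(1.0 - sum(c * c for c in v))
    U = [gamma] + [gamma * c for c in v]        # dx^gamma / dtau
    F = field_tensor(E, B)
    return [e * sum(F[a][g] * ETA_DIAG[g] * U[g] for g in range(4))
            for a in range(4)]


def lorentz_force(e, E, B, v):
    """Three-force e [E + v x B]."""
    cross = [v[1] * B[2] - v[2] * B[1],
             v[2] * B[0] - v[0] * B[2],
             v[0] * B[1] - v[1] * B[0]]
    return [e * (E[i] + cross[i]) for i in range(3)]


if __name__ == "__main__":
    e, E, B, v = 2.0, [1.0, 2.0, 3.0], [-0.5, 0.4, 2.0], [0.3, -0.4, 0.5]
    gamma = 1.0 / math.sqrt(1.0 - sum(c * c for c in v))
    f = four_force(e, E, B, v)
    print("f / gamma     :", [c / gamma for c in f[1:]])
    print("e [E + v x B] :", lorentz_force(e, E, B, v))
```

The factor $1/\gamma$ appears because (2.4.2) gives $d\mathbf{p}/d\tau = \mathbf{f}$ while the three-force is $d\mathbf{p}/dt$.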

There is a useful alternate form to the homogeneous equations (2.7.7):

$$\frac{\partial}{\partial x^\alpha}\,F_{\beta\gamma} + \frac{\partial}{\partial x^\beta}\,F_{\gamma\alpha} + \frac{\partial}{\partial x^\gamma}\,F_{\alpha\beta} = 0 \tag{2.7.10}$$

Note that for $\alpha$, $\beta$, $\gamma$ all different, Eq. (2.7.10) is the same as (2.7.7); for instance,
setting $\alpha = 0$ in Eq. (2.7.7) gives the same result as setting $\alpha\beta\gamma = 123$ in Eq.
(2.7.10). On the other hand, for two indices equal, Eq. (2.7.10) is an identity: for
instance, if $\beta = \gamma$ then (2.7.10) reads

$$\frac{\partial}{\partial x^\beta}\,F_{\beta\alpha} + \frac{\partial}{\partial x^\beta}\,F_{\alpha\beta} = 0 \qquad \text{(not summed)}$$

and this is identically true because $F_{\alpha\beta} = -F_{\beta\alpha}$.

Equation (2.7.7) allows us to represent $F_{\gamma\delta}$ as a “curl” of a four-vector $A_\gamma$:

$$F_{\gamma\delta} = \frac{\partial}{\partial x^\gamma}\,A_\delta - \frac{\partial}{\partial x^\delta}\,A_\gamma \tag{2.7.11}$$

(See Section 4.11.)

We can change $A_\gamma$ by a term $\partial_\gamma\phi$ without affecting $F_{\gamma\delta}$, so $A_\gamma$ may be defined so
that

$$\partial_\alpha A^\alpha = 0 \tag{2.7.12}$$

With (2.7.11) and (2.7.12), the rest of Maxwell’s equations reduce to

$$\Box^2 A^\alpha = -J^\alpha \tag{2.7.13}$$

8 Energy- Momentum Tensor 

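In Section 6 we introduced the density $\varepsilon$ and current J of electric charge. We
now give a similar definition for the density and current of the energy-momentum
four-vector $p^\alpha$. First consider a system of particles labeled n, with
energy-momentum four-vectors $p_n^\alpha(t)$. The density of $p^\alpha$ is defined by

$$T^{\alpha 0}(\mathbf{x}, t) \equiv \sum_n p_n^\alpha(t)\,\delta^3(\mathbf{x} - \mathbf{x}_n(t)) \tag{2.8.1}$$

and its current is defined by

$$T^{\alpha i}(\mathbf{x}, t) \equiv \sum_n p_n^\alpha(t)\,\frac{dx_n^i(t)}{dt}\,\delta^3(\mathbf{x} - \mathbf{x}_n(t)) \tag{2.8.2}$$

These two definitions can be united into a single formula,

$$T^{\alpha\beta}(x) = \sum_n p_n^\alpha\,\frac{dx_n^\beta(t)}{dt}\,\delta^3(\mathbf{x} - \mathbf{x}_n(t)) \tag{2.8.3}$$

where $x_n^0(t) \equiv t$. We note from (2.4.10) that

$$p_n^\beta = E_n\,\frac{dx_n^\beta}{dt}$$

so (2.8.3) can also be written as

$$T^{\alpha\beta}(x) = \sum_n \frac{p_n^\alpha\,p_n^\beta}{E_n}\,\delta^3(\mathbf{x} - \mathbf{x}_n(t)) \tag{2.8.4}$$

and we see that $T^{\alpha\beta}$ is symmetric:

$$T^{\alpha\beta}(x) = T^{\beta\alpha}(x) \tag{2.8.5}$$

We can also write (2.8.3) in analogy with (2.6.5) as

$$T^{\alpha\beta}(x) = \sum_n \int d\tau\,p_n^\alpha\,\frac{dx_n^\beta}{d\tau}\,\delta^4(x - x_n(\tau)) \tag{2.8.5a}$$

and we see that $T^{\alpha\beta}$ is a tensor, that is,

$$T'^{\alpha\beta} = \Lambda^\alpha{}_\gamma\,\Lambda^\beta{}_\delta\,T^{\gamma\delta}$$

under a Lorentz transformation (2.1.1).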

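The conservation law for $T^{\alpha\beta}$ will take a little more thought. Returning to
(2.8.1) and (2.8.2) we see that

$$\begin{aligned}
\frac{\partial}{\partial x^i}\,T^{\alpha i}(\mathbf{x}, t) &= -\sum_n p_n^\alpha(t)\,\frac{dx_n^i(t)}{dt}\,\frac{\partial}{\partial x_n^i}\,\delta^3(\mathbf{x} - \mathbf{x}_n(t)) \\
&= -\sum_n p_n^\alpha(t)\,\frac{\partial}{\partial t}\,\delta^3(\mathbf{x} - \mathbf{x}_n(t)) \\
&= -\frac{\partial}{\partial t}\,T^{\alpha 0}(\mathbf{x}, t) + \sum_n \frac{dp_n^\alpha(t)}{dt}\,\delta^3(\mathbf{x} - \mathbf{x}_n(t))
\end{aligned}$$

and so

$$\frac{\partial}{\partial x^\beta}\,T^{\alpha\beta} = G^\alpha \tag{2.8.6}$$

where $G^\alpha$ is the density of force:

$$G^\alpha(\mathbf{x}, t) \equiv \sum_n \delta^3(\mathbf{x} - \mathbf{x}_n(t))\,\frac{dp_n^\alpha(t)}{dt} = \sum_n \delta^3(\mathbf{x} - \mathbf{x}_n(t))\,\frac{d\tau}{dt}\,f_n^\alpha(t)$$

If the particles are free, then $p_n^\alpha$ is constant and $T^{\alpha\beta}$ is conserved, that is,

$$\frac{\partial}{\partial x^\beta}\,T^{\alpha\beta}(x) = 0 \tag{2.8.7}$$

The same is also true if the particles interact only during collisions that are strictly
localized in space. In this case (2.8.6) gives

$$\frac{\partial}{\partial x^\beta}\,T^{\alpha\beta}(x) = \sum_c \delta^3(\mathbf{x} - \mathbf{x}_c(t))\,\frac{d}{dt}\sum_{n \in c} p_n^\alpha(t)$$

where $\mathbf{x}_c(t)$ is the location of the cth collision going on at time t, and $n \in c$ means
we sum only over the particles participating in the cth collision. But each collision
conserves momentum, so $\sum_{n \in c} p_n^\alpha(t)$ must be time-independent, yielding the
conservation equation (2.8.7).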

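The energy-momentum tensor (2.8.3) will not be conserved if the particles are
subject to forces that act at a distance. For instance, consider a gas of charged
particles, with charges $e_n$. Then (2.8.6), (2.4.1), and (2.7.9) give

$$\frac{\partial}{\partial x^\beta}\,T^{\alpha\beta}(x) = \sum_n e_n\,F^\alpha{}_\gamma(x)\,\frac{dx_n^\gamma(t)}{dt}\,\delta^3(\mathbf{x} - \mathbf{x}_n(t))$$

and, using (2.6.4), this gives

$$\frac{\partial}{\partial x^\beta}\,T^{\alpha\beta}(x) = F^\alpha{}_\gamma(x)\,J^\gamma(x) \tag{2.8.8}$$

Although this is not conserved, we can construct a conserved tensor by adding a
purely electromagnetic term

$$T_{\rm em}^{\alpha\beta} \equiv F^\alpha{}_\gamma\,F^{\beta\gamma} - \tfrac{1}{4}\,\eta^{\alpha\beta}\,F_{\gamma\delta}F^{\gamma\delta} \tag{2.8.9}$$

That is, the electromagnetic energy and momentum densities are given by

$$T_{\rm em}^{00} = \tfrac{1}{2}(\mathbf{E}^2 + \mathbf{B}^2) \qquad T_{\rm em}^{i0} = (\mathbf{E}\times\mathbf{B})_i \tag{2.8.10}$$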
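The densities quoted in (2.8.10) follow from (2.8.9) by direct substitution of (2.7.5), and a short sketch can confirm the arithmetic. It assumes $\eta_{00} = -1$ and index 0 for time; the function names and sample field values are illustrative and not part of the text.

```python
ETA_DIAG = [-1.0, 1.0, 1.0, 1.0]   # diagonal of eta: eta_00 = -1, eta_ii = +1


def field_tensor(E, B):
    """F^{alpha beta} of Eq. (2.7.5), stored as F[alpha][beta]."""
    F = [[0.0] * 4 for _ in range(4)]
    for i in range(3):
        F[0][i + 1] = E[i]
        F[i + 1][0] = -E[i]
    F[1][2], F[2][3], F[3][1] = B[2], B[0], B[1]
    F[2][1], F[3][2], F[1][3] = -B[2], -B[0], -B[1]
    return F


def em_energy_momentum_tensor(E, B):
    """Eq. (2.8.9): T^{ab} = F^a_g F^{bg} - (1/4) eta^{ab} F_{gd} F^{gd}."""
    F = field_tensor(E, B)
    # Lowering both indices of F^{gd} multiplies each component by eta_gg * eta_dd.
    invariant = sum(ETA_DIAG[g] * ETA_DIAG[d] * F[g][d] ** 2
                    for g in range(4) for d in range(4))
    T = [[0.0] * 4 for _ in range(4)]
    for a in range(4):
        for b in range(4):
            # F^a_g F^{bg} = sum over g of F^{ag} eta_gg F^{bg}
            T[a][b] = sum(F[a][g] * ETA_DIAG[g] * F[b][g] for g in range(4))
            if a == b:
                T[a][b] -= 0.25 * ETA_DIAG[a] * invariant
    return T


if __name__ == "__main__":
    E, B = [1.0, 2.0, 3.0], [-0.5, 0.4, 2.0]
    T = em_energy_momentum_tensor(E, B)
    print("T^00          :", T[0][0])
    print("(E^2 + B^2)/2 :", 0.5 * (sum(c * c for c in E) + sum(c * c for c in B)))
    print("T^i0          :", [T[i][0] for i in (1, 2, 3)])
```

The same routine also shows that the tensor is symmetric and that its trace $\eta_{\alpha\beta}T_{\rm em}^{\alpha\beta}$ vanishes.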


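We note that

$$\frac{\partial}{\partial x^\beta}\,T_{\rm em}^{\alpha\beta} = F^\alpha{}_\gamma\,\frac{\partial}{\partial x^\beta}\,F^{\beta\gamma} + F^{\beta\gamma}\,\frac{\partial}{\partial x^\beta}\,F^\alpha{}_\gamma - \tfrac{1}{2}\,F_{\gamma\delta}\,\frac{\partial}{\partial x_\alpha}\,F^{\gamma\delta}$$

[Here $\partial/\partial x_\alpha \equiv \eta^{\alpha\beta}(\partial/\partial x^\beta)$.] With a little reshuffling of indices, this becomes

$$\frac{\partial}{\partial x^\beta}\,T_{\rm em}^{\alpha\beta} = F^\alpha{}_\gamma\,\frac{\partial}{\partial x^\beta}\,F^{\beta\gamma} - \tfrac{1}{2}\,F_{\beta\gamma}\left(\frac{\partial}{\partial x_\alpha}\,F^{\beta\gamma} + \frac{\partial}{\partial x_\beta}\,F^{\gamma\alpha} + \frac{\partial}{\partial x_\gamma}\,F^{\alpha\beta}\right)$$

Using the Maxwell equations (2.7.6) and (2.7.10), we find

$$\frac{\partial}{\partial x^\beta}\,T_{\rm em}^{\alpha\beta} = -F^\alpha{}_\gamma\,J^\gamma \tag{2.8.11}$$

Comparing (2.8.8) with (2.8.11), we are led to redefine the energy-momentum
tensor as

$$T^{\alpha\beta} \equiv \sum_n p_n^\alpha\,\frac{dx_n^\beta}{dt}\,\delta^3(\mathbf{x} - \mathbf{x}_n(t)) + T_{\rm em}^{\alpha\beta} \tag{2.8.12}$$

This is again a symmetric tensor, and is now conserved

$$\frac{\partial}{\partial x^\beta}\,T^{\alpha\beta} = 0 \tag{2.8.13}$$

We can continue to add more and more terms to $T^{\alpha\beta}$ to account for other fields
and keep $T^{\alpha\beta}$ conserved. A systematic method for constructing these terms is
presented in Chapter 12.

Just as the integral of the charge density $J^0$ is the total charge, the integral
of the density $T^{\alpha 0}$ of $p^\alpha$ is the total $p^\alpha$:

$$p^\alpha_{\rm total} = \int d^3x\,T^{\alpha 0}(\mathbf{x}, t) \tag{2.8.14}$$

That this is a constant four-vector can be shown in the same way that we showed
in Section 6 that the total charge (2.6.7) is a constant scalar.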


9 Spin 


One important use we can make of the energy-momentum tensor $T^{\alpha\beta}$ is to
define angular momentum and spin. Consider first an isolated system, for which the
total energy-momentum tensor $T^{\alpha\beta}$ is conserved

$$\frac{\partial}{\partial x^\gamma}\,T^{\beta\gamma} = 0$$

We can use T to construct another tensor,

$$M^{\gamma\alpha\beta} \equiv x^\alpha T^{\beta\gamma} - x^\beta T^{\alpha\gamma} \tag{2.9.1}$$

and because T is conserved and symmetric, M is also conserved:

$$\frac{\partial}{\partial x^\gamma}\,M^{\gamma\alpha\beta} = T^{\beta\alpha} - T^{\alpha\beta} = 0 \tag{2.9.2}$$

We can then form a total angular momentum

$$J^{\alpha\beta} = \int d^3x\,M^{0\alpha\beta} = -J^{\beta\alpha} \tag{2.9.3}$$

From (2.9.2) we see (by following the arguments of the last section) that $J^{\alpha\beta}$ is
constant in time and is a tensor. We further note that

$$J^{ij} = \int d^3x\,(x^i T^{j0} - x^j T^{i0})$$

and since $T^{j0}$ is the density of the jth component of momentum, we may regard
$J^{23}$, $J^{31}$, and $J^{12}$ as the 1-, 2-, and 3-components of the angular momentum. The
other components of $J^{\alpha\beta}$ are

$$J^{0i} = t\,p^i - \int x^i\,T^{00}\,d^3x$$

These components have no clear physical significance, and in fact can be made to
vanish if we fix the origin of coordinates to coincide with the “center of energy”
at t = 0, that is, if at t = 0 the moment $\int x^i\,T^{00}\,d^3x$ vanishes.

Although a tensor with regard to the homogeneous Lorentz transformations
$x^\alpha \to \Lambda^\alpha{}_\beta\,x^\beta$, the total angular momentum behaves peculiarly under the translation
$x^\alpha \to x'^\alpha = x^\alpha + a^\alpha$. From (2.9.3) and (2.8.13) we find that

$$J'^{\alpha\beta} = J^{\alpha\beta} + a^\alpha p^\beta - a^\beta p^\alpha \tag{2.9.4}$$

This is of course because $J^{\alpha\beta}$ includes the orbital angular momentum, which is
always defined with respect to some center of rotation. In order to isolate the
internal part of $J^{\alpha\beta}$ it is convenient to define a spin four-vector

$$S_\alpha \equiv \tfrac{1}{2}\,\epsilon_{\alpha\beta\gamma\delta}\,J^{\beta\gamma}\,U^\delta \tag{2.9.5}$$

where $\epsilon_{\alpha\beta\gamma\delta}$ is the completely antisymmetric tensor discussed in Section 5, and
$U^\alpha \equiv p^\alpha/(-p_\beta p^\beta)^{1/2}$ is the four-vector velocity of the system. Because of the
antisymmetry of $\epsilon_{\alpha\beta\gamma\delta}$, the translation $x^\alpha \to x^\alpha + a^\alpha$, which changes $J^{\beta\gamma}$ according
to the rule (2.9.4), does not change $S_\alpha$. Furthermore, $S_\alpha$ is obviously a vector and is
constant for a free particle

$$\frac{dS_\alpha}{dt} = 0 \tag{2.9.6}$$

Finally we note that in the center-of-mass frame of the system $U^i = 0$ and
$U^0 = 1$, so in this frame

$$S_1 = J^{23}, \quad S_2 = J^{31}, \quad S_3 = J^{12}, \quad S_0 = 0 \tag{2.9.7}$$

This justifies us in regarding $S_\alpha$ as the internal angular momentum of the system.
Even when the velocity $U^i$ is not zero, $S_\alpha$ really has only three independent
components, because (2.9.5) gives

$$U^\alpha S_\alpha = 0 \tag{2.9.8}$$

We use these properties of $S_\alpha$ later, when we discuss the precession of a
gyroscope in free fall.
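The three properties of $S_\alpha$ used above (invariance under translations, the rest-frame values (2.9.7), and the constraint (2.9.8)) can be checked with a small sketch of (2.9.5). It assumes $\eta_{00} = -1$, index 0 for time, and the sign convention (2.5.15) for the lowered Levi-Civita tensor; the function names and the sample values of $J^{\alpha\beta}$, $p^\alpha$, and $a^\alpha$ are illustrative.

```python
import itertools
import math

ETA_DIAG = [-1.0, 1.0, 1.0, 1.0]   # diagonal of eta: eta_00 = -1, eta_ii = +1


def levi_civita_lower():
    """epsilon with all indices lowered; by Eq. (2.5.15) it is minus the upper one."""
    eps = {}
    for perm in itertools.permutations(range(4)):
        inversions = sum(1 for i in range(4) for j in range(i + 1, 4)
                         if perm[i] > perm[j])
        upper = -1 if inversions % 2 else 1
        eps[perm] = -upper
    return eps


def spin_vector(J, p):
    """Eq. (2.9.5): covariant components S_alpha from J^{beta gamma} and p^delta."""
    mass = math.sqrt(-sum(ETA_DIAG[a] * p[a] * p[a] for a in range(4)))
    U = [c / mass for c in p]
    S = [0.0] * 4
    for (a, b, g, d), sign in levi_civita_lower().items():
        S[a] += 0.5 * sign * J[b][g] * U[d]
    return S


def translate(J, p, a):
    """Eq. (2.9.4): angular momentum tensor after the shift x -> x + a."""
    return [[J[i][j] + a[i] * p[j] - a[j] * p[i] for j in range(4)]
            for i in range(4)]


def antisymmetric(entries):
    """Build an antisymmetric 4x4 array from a dict of independent entries."""
    J = [[0.0] * 4 for _ in range(4)]
    for (i, j), value in entries.items():
        J[i][j], J[j][i] = value, -value
    return J


if __name__ == "__main__":
    J = antisymmetric({(0, 1): 0.3, (0, 2): -1.2, (0, 3): 0.8,
                       (1, 2): 2.0, (2, 3): 1.5, (3, 1): -0.7})
    p = [5.0, 1.0, 2.0, 3.0]
    print("S           :", spin_vector(J, p))
    print("S translated:", spin_vector(translate(J, p, [0.4, -1.0, 2.5, 0.3]), p))
    print("rest frame  :", spin_vector(J, [2.0, 0.0, 0.0, 0.0]))
```

The translated and untranslated spin vectors coincide because the added term $a^\beta p^\gamma - a^\gamma p^\beta$ is contracted with $\epsilon_{\alpha\beta\gamma\delta}U^\delta$, and $U^\delta$ is proportional to $p^\delta$.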


10 Relativistic Hydrodynamics 

A great many macroscopic physical systems, including perhaps the universe
itself, may be approximately regarded as perfect fluids. A perfect fluid is defined as
having at each point a velocity v, such that an observer moving with this velocity
sees the fluid around him as isotropic. This will be the case if the mean free path
between collisions is small compared with the scale of lengths used by the observer.
(For instance, a sound wave will propagate in air if its wavelength is large compared
with the mean free path, but at very short wavelengths viscosity becomes
important and the air stops acting like a perfect fluid.) We shall translate the above
definition of a perfect fluid into a statement about the energy-momentum tensor.

First suppose that we are in a frame of reference (distinguished by a tilde) in
which the fluid is at rest at some particular position and time. At this space-time
point, the perfect fluid hypothesis tells us that the energy-momentum tensor takes
the form characteristic of spherical symmetry:

$$\tilde{T}^{ij} = p\,\delta_{ij} \tag{2.10.1}$$

$$\tilde{T}^{i0} = \tilde{T}^{0i} = 0 \tag{2.10.2}$$

$$\tilde{T}^{00} = \rho \tag{2.10.3}$$

The coefficients p and $\rho$ are called the pressure and the proper energy density,
respectively. Now go into a reference frame at rest in the laboratory, and suppose
that the fluid in this frame appears to be moving (at the given space-time point)
with velocity v. The connection between the comoving coordinates $\tilde{x}^\beta$ and the
lab coordinates $x^\alpha$ is then

$$x^\alpha = \Lambda^\alpha{}_\beta(\mathbf{v})\,\tilde{x}^\beta$$

with $\Lambda^\alpha{}_\beta(\mathbf{v})$ the “boost” defined by Eqs. (2.1.17)-(2.1.21). But $\tilde{T}^{\alpha\beta}$ is a tensor, so
in the lab frame it is

$$T^{\alpha\beta} = \Lambda^\alpha{}_\gamma(\mathbf{v})\,\Lambda^\beta{}_\delta(\mathbf{v})\,\tilde{T}^{\gamma\delta}$$

or explicitly

$$T^{ij} = p\,\delta_{ij} + (p + \rho)\,\frac{v_i v_j}{1 - \mathbf{v}^2} \tag{2.10.4}$$

$$T^{i0} = (p + \rho)\,\frac{v_i}{1 - \mathbf{v}^2} \tag{2.10.5}$$

$$T^{00} = \frac{\rho + p\,\mathbf{v}^2}{1 - \mathbf{v}^2} \tag{2.10.6}$$

To check that this is a tensor, we note that (2.10.4)-(2.10.6) can be integrated into a
single equation:

$$T^{\alpha\beta} = p\,\eta^{\alpha\beta} + (p + \rho)\,U^\alpha U^\beta \tag{2.10.7}$$

where $U^\alpha$ is the velocity four-vector,

$$\mathbf{U} = \frac{d\mathbf{x}}{d\tau} = (1 - \mathbf{v}^2)^{-1/2}\,\mathbf{v} \qquad U^0 = \frac{dt}{d\tau} = (1 - \mathbf{v}^2)^{-1/2} \tag{2.10.8}$$

normalized so that

$$U_\alpha U^\alpha = -1 \tag{2.10.9}$$

Indeed, Eq. (2.10.7) could have been derived very easily by noting that the quantity
on the right-hand side is a tensor, which equals the tensor $\tilde{T}^{\alpha\beta}$ in a Lorentz frame
moving with the fluid, and hence must equal $T^{\alpha\beta}$ in all Lorentz frames.
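The agreement between the boosted rest-frame tensor and the covariant form (2.10.7) can be checked numerically. The sketch below assumes $\eta_{00} = -1$, index 0 for time, and the boost components of Eqs. (2.1.17)-(2.1.21); the function names and the sample values of $\rho$, p, and v are illustrative.

```python
import math

ETA_DIAG = [-1.0, 1.0, 1.0, 1.0]   # diagonal of eta: eta_00 = -1, eta_ii = +1


def boost(v):
    """4x4 boost matrix Lambda^alpha_beta, stored as lam[alpha][beta]."""
    v2 = sum(c * c for c in v)
    gamma = 1.0 / math.sqrt(1.0 - v2)
    lam = [[0.0] * 4 for _ in range(4)]
    lam[0][0] = gamma
    for i in range(3):
        lam[i + 1][0] = lam[0][i + 1] = gamma * v[i]
        for j in range(3):
            lam[i + 1][j + 1] = 1.0 if i == j else 0.0
            if v2 > 0.0:
                lam[i + 1][j + 1] += v[i] * v[j] * (gamma - 1.0) / v2
    return lam


def fluid_tensor_by_boost(rho, p, v):
    """Boost the rest-frame tensor diag(rho, p, p, p) of Eqs. (2.10.1)-(2.10.3)."""
    lam = boost(v)
    rest_diag = [rho, p, p, p]
    return [[sum(lam[a][g] * lam[b][g] * rest_diag[g] for g in range(4))
             for b in range(4)] for a in range(4)]


def fluid_tensor_covariant(rho, p, v):
    """Eq. (2.10.7): p eta^{ab} + (p + rho) U^a U^b."""
    gamma = 1.0 / math.sqrt(1.0 - sum(c * c for c in v))
    U = [gamma] + [gamma * c for c in v]
    return [[(p * ETA_DIAG[a] if a == b else 0.0) + (p + rho) * U[a] * U[b]
             for b in range(4)] for a in range(4)]


if __name__ == "__main__":
    rho, p, v = 2.0, 0.5, [0.3, -0.4, 0.5]
    A = fluid_tensor_by_boost(rho, p, v)
    B = fluid_tensor_covariant(rho, p, v)
    print("largest difference:",
          max(abs(A[a][b] - B[a][b]) for a in range(4) for b in range(4)))
    print("T^00:", A[0][0])
```

For these sample values $\mathbf{v}^2 = 0.5$, so (2.10.6) gives $T^{00} = (2 + 0.25)/0.5 = 4.5$.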

Apart from energy and momentum, a fluid will in general carry one or more
conserved quantities, such as the charge, the number of baryons minus the number
of antibaryons, or, at normal temperatures, the number of atoms. Let us consider
one such conserved quantity, and refer to it for brevity as the “particle number.”
If n is the particle number density in a Lorentz frame that moves with the fluid at
a given space-time point, then in this frame the particle current four-vector at
this point is

$$\tilde{N}^i = 0 \qquad \tilde{N}^0 = n \tag{2.10.10}$$

In any other Lorentz frame, in which the fluid at this point moves with velocity
v, the particle current is related to (2.10.10) by the “boost” $\Lambda(\mathbf{v})$:

$$N^i = \Lambda^i{}_\beta(\mathbf{v})\,\tilde{N}^\beta = (1 - \mathbf{v}^2)^{-1/2}\,v^i\,n \tag{2.10.11}$$

$$N^0 = \Lambda^0{}_\beta(\mathbf{v})\,\tilde{N}^\beta = (1 - \mathbf{v}^2)^{-1/2}\,n \tag{2.10.12}$$

or, more concisely,

$$N^\alpha = n\,U^\alpha \tag{2.10.13}$$

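The motion of the fluid will be governed by the equations of conservation of
energy and momentum,

$$0 = \frac{\partial T^{\alpha\beta}}{\partial x^\beta} = \frac{\partial p}{\partial x_\alpha} + \frac{\partial}{\partial x^\beta}\left[(\rho + p)\,U^\alpha U^\beta\right] \tag{2.10.14}$$

and of the particle number:

$$0 = \frac{\partial N^\alpha}{\partial x^\alpha} = \frac{\partial}{\partial x^\alpha}\,(n U^\alpha) = \frac{\partial}{\partial t}\left(n\,(1 - \mathbf{v}^2)^{-1/2}\right) + \nabla\cdot\left(n\,\mathbf{v}\,(1 - \mathbf{v}^2)^{-1/2}\right) \tag{2.10.15}$$

It is convenient to write (2.10.14) as separate three-vector and scalar equations.
The three-vector equation is obtained by setting $\alpha = i$ in Eq. (2.10.14), writing
$U^i = v^i U^0$, and then using Eq. (2.10.14) with $\alpha = 0$; this gives

$$\frac{\partial\mathbf{v}}{\partial t} + (\mathbf{v}\cdot\nabla)\mathbf{v} = -\frac{(1 - \mathbf{v}^2)}{(\rho + p)}\left[\nabla p + \mathbf{v}\,\frac{\partial p}{\partial t}\right] \tag{2.10.16}$$

The scalar equation is obtained by multiplying Eq. (2.10.14) by $U_\alpha$; using the
relation

$$0 = \frac{\partial}{\partial x^\beta}\,(U_\alpha U^\alpha) = 2\,U_\alpha\,\frac{\partial U^\alpha}{\partial x^\beta}$$

we then have

$$0 = U_\alpha\,\frac{\partial T^{\alpha\beta}}{\partial x^\beta} = U^\beta\,\frac{\partial p}{\partial x^\beta} - \frac{\partial}{\partial x^\beta}\left[(\rho + p)\,U^\beta\right]$$

Using Eq. (2.10.15), we can write this as

$$0 = U^\beta\left[\frac{\partial p}{\partial x^\beta} - n\,\frac{\partial}{\partial x^\beta}\left(\frac{\rho + p}{n}\right)\right] = -n\,U^\beta\left[p\,\frac{\partial}{\partial x^\beta}\left(\frac{1}{n}\right) + \frac{\partial}{\partial x^\beta}\left(\frac{\rho}{n}\right)\right] \tag{2.10.17a}$$

The second law of thermodynamics tells us that the pressure p, the energy density
$\rho$, and the volume per particle 1/n may be expressed as functions of the temperature
T and the entropy per particle $k\sigma$, in such a way that

$$kT\,d\sigma = p\,d\!\left(\frac{1}{n}\right) + d\!\left(\frac{\rho}{n}\right) \tag{2.10.18}$$

(Boltzmann’s constant k is introduced here to make $\sigma$ dimensionless.) Our scalar
equation (2.10.17a) can now be written

$$0 = U^\alpha\,\frac{\partial\sigma}{\partial x^\alpha} \propto \frac{\partial\sigma}{\partial t} + (\mathbf{v}\cdot\nabla)\sigma \tag{2.10.19}$$

The specific entropy $\sigma$ is therefore constant in time at any point that moves along
with the fluid. The fundamental equations of relativistic hydrodynamics are the
“continuity equation” (2.10.15), the “Euler equations” (2.10.16), the “energy
equation” (2.10.19), together with equations of state that give p and $\rho$ in terms of
n and $\sigma$.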

In order to gain some insight into the possible equations of state, we may 
consider a fluid composed of structureless point particles that interact only in 
spatially localized collisions. As shown in Section 2.8, the energy-momentum
tensor is

$$T^{\alpha\beta} = \sum_N \frac{p_N^\alpha\, p_N^\beta}{E_N}\, \delta^3(\mathbf{x} - \mathbf{x}_N) \tag{2.10.20}$$

[See Eq. (2.8.4).] In a comoving Lorentz frame, $T^{\alpha\beta}$ will have the isotropic form
(2.10.1)-(2.10.3), so the pressure and energy density will be given in this frame by

$$p = \frac{1}{3}\sum_{i=1}^{3} T^{ii} = \frac{1}{3}\sum_N \frac{\mathbf{p}_N^2}{E_N}\, \delta^3(\mathbf{x} - \mathbf{x}_N) \tag{2.10.21}$$

$$\rho = T^{00} = \sum_N E_N\, \delta^3(\mathbf{x} - \mathbf{x}_N) \tag{2.10.22}$$

whereas the particle number density is, in analogy with (2.6.2),

$$n = \sum_N \delta^3(\mathbf{x} - \mathbf{x}_N) \tag{2.10.23}$$

It follows that in general

$$0 \le p \le \frac{\rho}{3} \tag{2.10.24}$$

For a cool, nonrelativistic gas, we can approximate

$$E_N \simeq m + \frac{\mathbf{p}_N^2}{2m}$$

so (2.10.22) gives

$$\rho \simeq nm + \tfrac{3}{2}\, p \tag{2.10.25}$$

For a hot, extremely relativistic gas, we have

$$E_N \simeq |\mathbf{p}_N| \gg m$$

so (2.10.22) gives

$$\rho \simeq 3p \gg nm \tag{2.10.26}$$

Both (2.10.25) and (2.10.26) can be incorporated into a single equation,

$$\rho - nm \simeq (\gamma - 1)^{-1}\, p \tag{2.10.27}$$

with

$$\gamma = \begin{cases} \tfrac{5}{3} & \text{nonrelativistic} \\ \tfrac{4}{3} & \text{extreme relativistic} \end{cases} \tag{2.10.28}$$

Equation (2.10.18) then gives

$$kT\, d\sigma = p\, d\left(\frac{1}{n}\right) + (\gamma - 1)^{-1}\, d\left(\frac{p}{n}\right) = \frac{n^{\gamma - 1}}{\gamma - 1}\, d\left(\frac{p}{n^\gamma}\right) \tag{2.10.29}$$

Thus Eq. (2.10.19) takes the form

$$0 = \frac{\partial}{\partial t}\left(\frac{p}{n^\gamma}\right) + (\mathbf{v}\cdot\nabla)\left(\frac{p}{n^\gamma}\right) \tag{2.10.30}$$

and (2.10.27) is to be used to express $\rho$ in terms of n and p in Eq. (2.10.16). The
proportionality expressed in Eq. (2.10.27) between internal energy and pressure
actually holds, with various values of $\gamma$, over a class of fluids much wider than the
simple gas of point particles discussed here. For all such fluids, the energy equation
can be put in the form (2.10.30).
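The two limiting equations of state can be illustrated with a small numerical sketch.
It evaluates the volume-averaged sums for the pressure, energy density, and number
density of a gas of point particles, as in (2.10.21)-(2.10.23), for a hand-picked list of
momentum magnitudes, and forms the ratio $(\rho - nm)/p$ that (2.10.27) identifies with
$(\gamma - 1)^{-1}$. The particle mass, the momentum lists, and the function names are
assumptions made only for this illustration.

```python
import math

def gas_state(momenta, m, volume=1.0):
    """Pressure, energy density, and number density of point particles (c = 1).

    `momenta` is a list of momentum magnitudes |p_N|; the delta functions of
    (2.10.21)-(2.10.23) are replaced by an average over `volume`.
    """
    n = len(momenta) / volume
    energies = [math.sqrt(p * p + m * m) for p in momenta]
    pressure = sum(p * p / e for p, e in zip(momenta, energies)) / (3.0 * volume)
    rho = sum(energies) / volume
    return pressure, rho, n

def gamma_minus_one_inverse(momenta, m):
    """Ratio (rho - n m) / p, to be compared with 1 / (gamma - 1) in (2.10.27)."""
    p, rho, n = gas_state(momenta, m)
    return (rho - n * m) / p

m = 1.0
cold = [1e-3 * k for k in range(1, 11)]   # |p| << m: nonrelativistic
hot = [1e3 * k for k in range(1, 11)]     # |p| >> m: extreme relativistic

print(gamma_minus_one_inverse(cold, m))   # close to 3/2, i.e. gamma = 5/3
print(gamma_minus_one_inverse(hot, m))    # close to 3, i.e. gamma = 4/3
```

For momenta comparable to m the ratio falls between these two values, which is one way
of seeing that a single constant $\gamma$ is only an approximation in the intermediate regime.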

As an example, let us calculate the speed of sound in a static homogeneous
relativistic fluid. In the unperturbed state, we have n, $\rho$, p, and $\sigma$ constant in space
and time, and $\mathbf{v} = 0$. The sound wave produces small changes $n_1$, $\rho_1$, $p_1$, and $\mathbf{v}_1$
in n, $\rho$, p, and $\mathbf{v}$, but according to (2.10.19), it leaves $\sigma$ unchanged. To first order in
small quantities, Eqs. (2.10.15) and (2.10.16) then read

$$\frac{\partial n_1}{\partial t} + n \nabla\cdot\mathbf{v}_1 = 0$$

$$\frac{\partial \mathbf{v}_1}{\partial t} = -\frac{\nabla p_1}{p + \rho}$$

But with $d\sigma = 0$, Eq. (2.10.18) gives

$$0 = -\frac{(p + \rho)}{n^2}\, n_1 + \frac{\rho_1}{n}$$

so that

$$p_1 = v_s^2\, \rho_1 = v_s^2\, \frac{(p + \rho)}{n}\, n_1$$

where

$$v_s^2 \equiv \left(\frac{\partial p}{\partial \rho}\right)_\sigma \tag{2.10.31}$$

Combining the equations for $n_1$ and $\mathbf{v}_1$, we obtain a wave equation

$$0 = \left(\frac{\partial^2}{\partial t^2} - v_s^2 \nabla^2\right) n_1$$

that shows that sound waves travel with the speed $v_s$, just as in a nonrelativistic
fluid. The speed of sound is much less than the speed of light (i.e., unity) for a
nonrelativistic fluid, but it increases with temperature, so it is worth checking
whether $v_s$ might exceed unity for a fluid of highly relativistic point particles, such
as hydrogen above $10^{13}$ °K. In this case, (2.10.26) and (2.10.31) give a sound
speed

$$v_s = \frac{1}{\sqrt{3}} \tag{2.10.32}$$

which is still safely less than unity. This conclusion would not be affected if
electromagnetic forces were taken into account, because Eqs. (2.10.7) and (2.8.9) impose
on the electromagnetic pressure $p_{\rm em}$ and energy density $\rho_{\rm em}$ the relation

$$0 = T_{\rm em}{}^\alpha{}_\alpha = 3 p_{\rm em} - \rho_{\rm em} \tag{2.10.33}$$

so the inclusion of $p_{\rm em}$ and $\rho_{\rm em}$ would not invalidate (2.10.26) or (2.10.32). It is an
open question whether $v_s$ remains less than unity when nonelectromagnetic forces
are taken into account.$^4$


11 Relativistic Imperfect Fluids*

The last section dealt with a perfect fluid, in which mean free paths and times 
are so short that perfect isotropy is maintained about any point moving with the 
fluid. In practice, one often has to deal with somewhat imperfect fluids, in which 
the pressure, density, or velocity vary appreciably over distances of the order of a 
mean free path, or over times of the order of a mean free time, or both. In such 
fluids, thermal equilibrium is not strictly maintained, and the fluid kinetic energy 
is dissipated as heat. 

The correct treatment of dissipative effects for relativistic fluids raises certain 
delicate questions of principle, which do not arise in the nonrelativistic case. For 
this reason, and also because dissipation plays an increasingly important role in 
theories of the early universe (see Sections 15.8, 15.10, 15.11), it will be worth our 
while here to develop the outlines of the general theory of relativistic imperfect 
fluids. 

We suppose that the presence of weak space-time gradients in an imperfect
fluid has the effect of modifying the energy-momentum tensor and particle current
vector by terms $\Delta T^{\alpha\beta}$ and $\Delta N^\alpha$, which are of first order in these gradients. Instead
of (2.10.7) and (2.10.13), we then have

$$T^{\alpha\beta} = p\, \eta^{\alpha\beta} + (p + \rho)\, U^\alpha U^\beta + \Delta T^{\alpha\beta} \tag{2.11.1}$$

$$N^\alpha = n U^\alpha + \Delta N^\alpha \tag{2.11.2}$$

Once we allow such correction terms, the definitions of the pressure p, energy
density $\rho$, particle density n, and fluid velocity $U^\alpha$ become somewhat ambiguous.
The general practice is to define $\rho$ and n as the total energy density and particle
number density in a comoving frame:

$$T^{00} = \rho \tag{2.11.3}$$

$$N^0 = n \tag{2.11.4}$$

a comoving frame being characterized by the condition that at a given point, the
velocity four-vector is

$$U^i = 0 \qquad U^0 = 1 \tag{2.11.5}$$

In addition, the pressure p is generally defined to be the same function of $\rho$ and n
[e.g., (2.10.27)] as in the case where all fluid gradients are negligible and dissipation
is absent. Finally, it is necessary in a relativistic fluid to specify whether $U^\alpha$ is the
velocity of energy transport or particle transport. In the approach of Landau and
Lifshitz,$^5$ $U^\alpha$ is taken to be the velocity of energy transport, so that $T^{i0}$ vanishes
in a comoving frame. In the approach of Eckart,$^6$ $U^\alpha$ is taken to be the velocity of
particle transport, so that it is $N^i$ that vanishes in a comoving frame. The two
approaches are perfectly equivalent, but Eckart’s seems to me to be slightly more
convenient, and will be adopted here. With this definition of $U^\alpha$, we then have in a
comoving frame

$$N^i = 0 \tag{2.11.6}$$

* This section lies somewhat out of the book’s main line of development, and may be omitted in a first
reading.

A comparison of (2.11.3)-(2.11.6) with (2.11.1) and (2.11.2) shows that in a
comoving frame, the dissipative terms $\Delta T^{\alpha\beta}$ and $\Delta N^\alpha$ are subject to the constraints

$$\Delta T^{00} = \Delta N^0 = \Delta N^i = 0 \tag{2.11.7}$$

and therefore, in a general Lorentz frame,

$$U_\alpha U_\beta\, \Delta T^{\alpha\beta} = 0 \tag{2.11.8}$$

$$\Delta N^\alpha = 0 \tag{2.11.9}$$

All effects of dissipation thus show up as contributions to $\Delta T^{\alpha\beta}$. Our task is now to
construct the most general possible dissipative tensor $\Delta T^{\alpha\beta}$ allowed by Eq. (2.11.8)
and by the second law of thermodynamics.

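To this end, let us calculate the entropy produced by fluid motions. As in the
last section, we start by contracting the conservation law (2.8.7) with $U_\alpha$:

$$0 = U_\alpha \frac{\partial T^{\alpha\beta}}{\partial x^\beta} \tag{2.11.10}$$

By following the same reasoning that was used to derive (2.10.19) for a perfect
fluid, one sees that in general

$$U_\alpha \frac{\partial}{\partial x^\beta}\left[p\, \eta^{\alpha\beta} + (p + \rho)\, U^\alpha U^\beta\right] = -kT\, \frac{\partial}{\partial x^\alpha}\left(n \sigma U^\alpha\right)$$

where T and $k\sigma$ are the temperature and entropy per particle, defined by Eq.
(2.10.18). Hence (2.11.10) now reads

$$\frac{\partial}{\partial x^\alpha}\left(n \sigma U^\alpha\right) = \frac{1}{kT}\, U_\alpha \frac{\partial}{\partial x^\beta}\, \Delta T^{\alpha\beta}$$

or equivalently

$$\frac{\partial S^\alpha}{\partial x^\alpha} = -\frac{1}{T}\, \frac{\partial U_\alpha}{\partial x^\beta}\, \Delta T^{\alpha\beta} + \frac{1}{T^2}\, \frac{\partial T}{\partial x^\beta}\, U_\alpha\, \Delta T^{\alpha\beta} \tag{2.11.11}$$

where

$$S^\alpha \equiv n k \sigma U^\alpha - T^{-1}\, U_\beta\, \Delta T^{\alpha\beta} \tag{2.11.12}$$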

The entropy density in a comoving frame is $nk\sigma = S^0$, so we may interpret $S^\alpha$ as
the entropy current four-vector, and Eq. (2.11.11) thus gives the rate of entropy
production per unit volume. The second law of thermodynamics then requires that
$\Delta T^{\alpha\beta}$ be a linear combination of velocity and temperature gradients, such that the
right-hand side of (2.11.11) is positive for all possible fluid configurations. Note that
this is only possible because we have included the second term in Eq. (2.11.12);
without this term, $\partial S^\alpha/\partial x^\alpha$ would not be simply quadratic in first derivatives, and
hence could not be positive for all fluid configurations. Note also that $\Delta T^{\alpha\beta}$ is not
allowed to involve gradients of $\rho$, p, n, and so on, because if it did then (2.11.11)
would contain products of pressure or density gradients with velocity or temperature
gradients, and, again, these products would not be positive for all fluid
configurations.

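It is convenient at this point to go over to a comoving frame, in which $U^\alpha$
has the form (2.11.5) at a given space-time point P. From (2.10.17), it follows that
in this frame, all gradients of $U^0$ vanish at P. Setting $U^i$, $\partial U^0/\partial x^\alpha$, and $\Delta T^{00}$ equal
to zero in Eq. (2.11.11), we find that in a Lorentz frame comoving at P, the rate of
entropy production per unit volume at P is

$$\frac{\partial S^\alpha}{\partial x^\alpha} = -\left(\frac{1}{T}\, \dot{U}_i + \frac{1}{T^2}\, \frac{\partial T}{\partial x^i}\right) \Delta T^{i0} - \frac{1}{T}\, \frac{\partial U_i}{\partial x^j}\, \Delta T^{ij} \tag{2.11.13}$$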


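In order for this to be positive for all possible fluid configurations, we must have

$$\Delta T^{i0} = -\chi \left(\frac{\partial T}{\partial x^i} + T\, \dot{U}_i\right) \tag{2.11.14}$$

$$\Delta T^{ij} = -\eta \left(\frac{\partial U_i}{\partial x^j} + \frac{\partial U_j}{\partial x^i} - \frac{2}{3}\, \nabla\cdot\mathbf{U}\, \delta_{ij}\right) - \zeta\, \nabla\cdot\mathbf{U}\, \delta_{ij} \tag{2.11.15}$$

with positive coefficients

$$\chi \ge 0 \qquad \eta \ge 0 \qquad \zeta \ge 0 \tag{2.11.16}$$

so that (2.11.13) reads

$$\frac{\partial S^\alpha}{\partial x^\alpha} = \frac{\chi}{T^2}\left(\nabla T + T\, \dot{\mathbf{U}}\right)^2 + \frac{\eta}{2T}\left(\frac{\partial U_i}{\partial x^j} + \frac{\partial U_j}{\partial x^i} - \frac{2}{3}\, \delta_{ij}\, \nabla\cdot\mathbf{U}\right)\left(\frac{\partial U_i}{\partial x^j} + \frac{\partial U_j}{\partial x^i} - \frac{2}{3}\, \delta_{ij}\, \nabla\cdot\mathbf{U}\right) + \frac{\zeta}{T}\left(\nabla\cdot\mathbf{U}\right)^2 \ge 0 \tag{2.11.17}$$

Except for the relativistic correction $T\dot{\mathbf{U}}$ in (2.11.14), the form of (2.11.14) and
(2.11.15) is the same as in the nonrelativistic theory of imperfect fluids,$^5$ and we
therefore may identify $\chi$, $\eta$, and $\zeta$ as the coefficients of heat conduction, shear
viscosity, and bulk viscosity.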
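The positivity argument can be checked numerically. The sketch below works in a
comoving frame, takes an arbitrary velocity-gradient matrix $\partial U_i/\partial x^j$, builds $\Delta T^{ij}$
from (2.11.15), and compares the viscous part of the entropy production in (2.11.13),
$-(1/T)(\partial U_i/\partial x^j)\Delta T^{ij}$, with the manifestly non-negative quadratic form in (2.11.17).
The numerical values of $\eta$, $\zeta$, T, and the gradients are assumptions chosen only for
illustration, and the heat-conduction term is left out.

```python
def shear(grad):
    """Symmetric traceless combination dU_i/dx^j + dU_j/dx^i - (2/3) delta_ij div U."""
    div = sum(grad[i][i] for i in range(3))
    return [[grad[i][j] + grad[j][i] - (2.0 / 3.0) * div * (1.0 if i == j else 0.0)
             for j in range(3)] for i in range(3)]

def delta_T_spatial(grad, eta, zeta):
    """Comoving-frame dissipative stress Delta T^{ij} of Eq. (2.11.15)."""
    div = sum(grad[i][i] for i in range(3))
    w = shear(grad)
    return [[-eta * w[i][j] - zeta * div * (1.0 if i == j else 0.0)
             for j in range(3)] for i in range(3)]

def entropy_rate_direct(grad, eta, zeta, T):
    """Viscous part of (2.11.13): -(1/T) (dU_i/dx^j) Delta T^{ij}."""
    dT = delta_T_spatial(grad, eta, zeta)
    return -sum(grad[i][j] * dT[i][j] for i in range(3) for j in range(3)) / T

def entropy_rate_quadratic(grad, eta, zeta, T):
    """Viscous part of (2.11.17): (eta / 2T) W_ij W_ij + (zeta / T) (div U)^2."""
    div = sum(grad[i][i] for i in range(3))
    w = shear(grad)
    return (eta / (2.0 * T)) * sum(x * x for row in w for x in row) + (zeta / T) * div * div

# grad[i][j] stands for dU_i/dx^j at the point P (illustrative numbers).
grad = [[0.2, -1.0, 0.5],
        [0.7, -0.4, 0.1],
        [-0.3, 0.9, 0.6]]
eta, zeta, T = 1.3, 0.4, 2.0

direct = entropy_rate_direct(grad, eta, zeta, T)
quadratic = entropy_rate_quadratic(grad, eta, zeta, T)
print(abs(direct - quadratic) < 1e-9, quadratic >= 0.0)
```

The agreement relies on the shear combination being symmetric and traceless, so that
contracting it with the full gradient matrix gives half of its own square.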

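It now only remains to translate our results from the forms (2.11.5), (2.11.7),
(2.11.14), (2.11.15), which are valid only in comoving frames, to forms valid in
general Lorentz frames. Let us define a shear tensor,

$$W_{\alpha\beta} \equiv \frac{\partial U_\alpha}{\partial x^\beta} + \frac{\partial U_\beta}{\partial x^\alpha} - \frac{2}{3}\, \eta_{\alpha\beta}\, \frac{\partial U^\gamma}{\partial x^\gamma} \tag{2.11.18}$$

a heat-flow vector,

$$Q_\alpha \equiv \frac{\partial T}{\partial x^\alpha} + T\, \frac{\partial U_\alpha}{\partial x^\beta}\, U^\beta \tag{2.11.19}$$

and a projection tensor on the hyperplane normal to $U^\alpha$:

$$H_{\alpha\beta} \equiv \eta_{\alpha\beta} + U_\alpha U_\beta \tag{2.11.20}$$

It is straightforward to check that in a comoving Lorentz frame, our formulas
(2.11.7), (2.11.14), (2.11.15) for $\Delta T^{\alpha\beta}$ are satisfied by the tensor

$$\Delta T^{\alpha\beta} = -\eta\, H^{\alpha\gamma} H^{\beta\delta}\, W_{\gamma\delta} - \chi \left(H^{\alpha\gamma} U^\beta + H^{\beta\gamma} U^\alpha\right) Q_\gamma - \zeta\, H^{\alpha\beta}\, \frac{\partial U^\gamma}{\partial x^\gamma} \tag{2.11.21}$$

Since this formula is Lorentz-invariant, and valid in a comoving Lorentz frame, it is
valid in all Lorentz frames.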
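The role of the projection tensor can be made concrete with a short sketch. Assuming
the metric $\eta_{\alpha\beta} = {\rm diag}(1, 1, 1, -1)$ in the index order (1, 2, 3, 0) and an arbitrary
illustrative fluid velocity, it checks that $H_{\alpha\beta} U^\beta = 0$, that the trace $H^\alpha{}_\alpha$ equals 3,
and that in a comoving frame $H_{\alpha\beta}$ reduces to the unit matrix on the spatial indices.
The function names are hypothetical.

```python
import math

ETA = [[1.0, 0.0, 0.0, 0.0],
       [0.0, 1.0, 0.0, 0.0],
       [0.0, 0.0, 1.0, 0.0],
       [0.0, 0.0, 0.0, -1.0]]   # eta_{alpha beta}, index order (1, 2, 3, 0)

def four_velocity(v):
    """U^alpha = gamma * (v, 1) for a 3-velocity v with |v| < 1."""
    gamma = 1.0 / math.sqrt(1.0 - sum(c * c for c in v))
    return [gamma * v[0], gamma * v[1], gamma * v[2], gamma]

def lower(vec):
    """Lower an index with eta."""
    return [sum(ETA[a][b] * vec[b] for b in range(4)) for a in range(4)]

def projector(v):
    """H_{alpha beta} = eta_{alpha beta} + U_alpha U_beta, Eq. (2.11.20)."""
    u_low = lower(four_velocity(v))
    return [[ETA[a][b] + u_low[a] * u_low[b] for b in range(4)] for a in range(4)]

v = (0.6, 0.0, -0.3)   # illustrative fluid velocity
U = four_velocity(v)
H = projector(v)

# H_{alpha beta} U^beta should vanish: H projects onto the hyperplane normal to U.
HU = [sum(H[a][b] * U[b] for b in range(4)) for a in range(4)]
# The trace eta^{alpha beta} H_{alpha beta} should be 3 (this eta is its own inverse).
trace = sum(ETA[a][a] * H[a][a] for a in range(4))
print(max(abs(x) for x in HU) < 1e-12, abs(trace - 3.0) < 1e-12)
```

Because $H^{\alpha\gamma} U_\gamma = 0$, every term of (2.11.21) is annihilated by the contraction with
$U_\alpha U_\beta$, which is how the constraint (2.11.8) is satisfied in any frame.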

In general, the coefficients $\chi T$, $\eta$, and $\zeta$ might be expected on dimensional
grounds to be of the order of the pressure, or the thermal energy density, times
some sort of mean free time. However, there are important special cases 0 in which
the bulk viscosity $\zeta$ is much smaller than $\eta$ or $\chi T$. To see when this applies, note
that (2.11.1) and (2.11.21) give the trace of the total energy-momentum tensor as

$$T^\alpha{}_\alpha = 3p - \rho - 3\zeta\, \frac{\partial U^\gamma}{\partial x^\gamma} \tag{2.11.22}$$

Suppose that we are dealing with a medium for which this trace can be expressed
as a function of $\rho$ and n alone:

$$T^\alpha{}_\alpha = f(\rho, n) \tag{2.11.23}$$

For instance, for the simple gas characterized by (2.10.20), this trace is

$$T^\alpha{}_\alpha = -\sum_N \frac{m^2}{E_N}\, \delta^3(\mathbf{x} - \mathbf{x}_N)$$

In the extreme relativistic case, we have $E_N \gg m$, so in this case (2.11.23) is
satisfied, with

$$f(\rho, n) \simeq 0$$

In the nonrelativistic case, we have

$$\frac{1}{E_N} \simeq \frac{1}{m} - \frac{E_N - m}{m^2}$$

so in this case (2.11.23) is again satisfied, with

$$f(\rho, n) \simeq -mn + (\rho - nm)$$

In the absence of velocity gradients, Eqs. (2.11.22) and (2.11.23) would give a
formula for the pressure

$$p = \tfrac{1}{3}\left[\rho + f(\rho, n)\right] \tag{2.11.24}$$

But we have agreed to define p in general as the same function of $\rho$ and n as in the
absence of dissipation, so (2.11.24) must hold even in the presence of velocity
gradients, and therefore (2.11.22), (2.11.23), and (2.11.24) give

$$\zeta = 0 \tag{2.11.25}$$

It would be wrong, however, to conclude that $\zeta$ is generally negligible. As we have
seen, the trace of the energy-momentum tensor for a simple gas is a function of $\rho$
and n only in the extreme relativistic or extreme nonrelativistic limit; for kT of
order m, $T^\alpha{}_\alpha$ cannot be expressed in the form (2.11.23), and the bulk viscosity is
of the same order as the shear viscosity.$^7$ The bulk viscosity is also important$^8$ in a
fluid that allows an easy exchange of energy between translational and internal
degrees of freedom, as in the case of a gas of rough spheres.$^9$ Another case, of
particular importance to cosmology, is that of a material medium with very short
mean free times, interacting with radiation quanta with a finite mean free time $\tau$.
In this case, the coefficients of heat conduction, shear viscosity, and bulk viscosity
are calculated to be$^{10}$

$$\chi = \tfrac{4}{3}\, a T^3 \tau \tag{2.11.26}$$

$$\eta = \tfrac{4}{15}\, a T^4 \tau \tag{2.11.27}$$

$$\zeta = 4 a T^4 \tau \left[\frac{1}{3} - \left(\frac{\partial p}{\partial \rho}\right)_n\right]^2 \tag{2.11.28}$$

where a is the Stefan-Boltzmann constant, defined so that the radiation energy
density is $aT^4$, and p and $\rho$ are the total pressure and energy density of the matter
and radiation. Note that, in general, $\chi T$, $\eta$, and $\zeta$ are comparable, but if the pressure
and thermal energy are dominated by radiation, then $(\partial p/\partial \rho)_n \simeq \tfrac{1}{3}$ and, as
expected, the bulk viscosity will be small.

12 Representations of the Lorentz Group* 

The tensor formalism described in Section 2.5 is perfectly adequate for handling 
the problems of relativistic classical physics. However, there are certain formal 
advantages in looking at these Lorentz transformation rules in a more general way, 
from the perspective of the theory of the representations of the homogeneous 
Lorentz group. We shall see in Section 12.5 that this approach allows an elegant
reformulation of the effects of gravitation on arbitrary physical systems. Also, it 
is only in this way that we can deal with fields of half-integer spin. 

Under the general Lorentz transformation rule, a set of quantities $\psi_n$ transform
under a Lorentz transformation $\Lambda$ into the new quantities:

$$\psi'_n = \sum_m \left[D(\Lambda)\right]_{nm} \psi_m \tag{2.12.1}$$

In order for a Lorentz transformation $\Lambda_1$ followed by a Lorentz transformation $\Lambda_2$
to give the same result as the Lorentz transformation $\Lambda_1 \Lambda_2$, it is necessary that the
matrices $D(\Lambda)$ should furnish a representation of the Lorentz group, that is,

$$D(\Lambda_1)\, D(\Lambda_2) = D(\Lambda_1 \Lambda_2) \tag{2.12.2}$$

with matrix multiplication now understood. For instance, if $\psi$ is a contravariant
vector $V^\alpha$, then $D(\Lambda)$ is simply

$$\left[D(\Lambda)\right]^\alpha{}_\beta = \Lambda^\alpha{}_\beta \tag{2.12.3}$$

whereas for a covariant tensor $T_{\alpha\beta}$, the corresponding D-matrix is

$$\left[D(\Lambda)\right]_{\gamma\delta}{}^{\alpha\beta} = \Lambda_\gamma{}^\alpha\, \Lambda_\delta{}^\beta \tag{2.12.4}$$

It is easy to check that (2.12.3) and (2.12.4) do satisfy the group multiplication rule 
(2.12.2). We can compile a catalogue of all possible Lorentz transformation rules by 
constructing the most general representation of the homogeneous Lorentz group. 

In fact, the most general true representations of the homogeneous Lorentz 
group are provided by the tensor representations, such as (2.12.3) and (2.12.4), so 
we might expect that all quantities of physical interest should be tensors. However, 
there are additional representations of the infinitesimal Lorentz group, the spinor 
representations, that play an important role in relativistic quantum field theory. 
The infinitesimal Lorentz group consists of Lorentz transformations infinitesimally 
close to the identity, that is,

$$\Lambda^\alpha{}_\beta = \delta^\alpha{}_\beta + \omega^\alpha{}_\beta \qquad |\omega^\alpha{}_\beta| \ll 1 \tag{2.12.5}$$


* This section lies somewhat out of the book’s main line of development, and may be omitted in a first 
reading. 





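In order for this to satisfy the fundamental condition (2.1.2) for a Lorentz
transformation, we must have

$$\left(\delta^\alpha{}_\gamma + \omega^\alpha{}_\gamma\right)\left(\delta^\beta{}_\delta + \omega^\beta{}_\delta\right) \eta_{\alpha\beta} = \eta_{\gamma\delta}$$

or, to first order in $\omega$,

$$\omega_{\gamma\delta} = -\omega_{\delta\gamma} \tag{2.12.6}$$

with the indices on $\omega$ of course lowered with $\eta$:

$$\omega_{\alpha\beta} \equiv \eta_{\alpha\gamma}\, \omega^\gamma{}_\beta$$

For such a transformation, the matrix representation $D(\Lambda)$ must be infinitesimally
close to the identity

$$D(1 + \omega) = 1 + \tfrac{1}{2}\, \omega^{\alpha\beta} \sigma_{\alpha\beta} \tag{2.12.7}$$

where $\sigma_{\alpha\beta}$ are a fixed set of matrices, which by virtue of (2.12.6) can always be
chosen to be antisymmetric in $\alpha$ and $\beta$:

$$\sigma_{\alpha\beta} = -\sigma_{\beta\alpha} \tag{2.12.8}$$

For instance, for the tensor representations (2.12.3) and (2.12.4), we have

$$\left[\sigma_{\alpha\beta}\right]^\gamma{}_\delta = \delta_\alpha{}^\gamma\, \eta_{\beta\delta} - \delta_\beta{}^\gamma\, \eta_{\alpha\delta} \tag{2.12.9}$$

$$\left[\sigma_{\alpha\beta}\right]_{\gamma\delta}{}^{\epsilon\zeta} = \eta_{\alpha\gamma}\, \delta_\beta{}^\epsilon\, \delta_\delta{}^\zeta - \eta_{\beta\gamma}\, \delta_\alpha{}^\epsilon\, \delta_\delta{}^\zeta + \eta_{\alpha\delta}\, \delta_\beta{}^\zeta\, \delta_\gamma{}^\epsilon - \eta_{\beta\delta}\, \delta_\alpha{}^\zeta\, \delta_\gamma{}^\epsilon \tag{2.12.10}$$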

The matrices $\sigma_{\alpha\beta}$ are not allowed to be just any set of constant matrices, but
must be constrained so that $D(\Lambda)$ satisfies the group multiplication rule (2.12.2).
It is convenient first to apply this rule to the product $\Lambda[1 + \omega]\Lambda^{-1}$:

$$D(\Lambda)\, D(1 + \omega)\, D(\Lambda^{-1}) = D(1 + \Lambda \omega \Lambda^{-1})$$

To zero order in $\omega$, this simply says that 1 = 1, whereas to first order, we must
equate the coefficients of $\omega^{\alpha\beta}$ on both sides:

$$D(\Lambda)\, \sigma_{\alpha\beta}\, D(\Lambda^{-1}) = \sigma_{\gamma\delta}\, \Lambda^\gamma{}_\alpha\, \Lambda^\delta{}_\beta \tag{2.12.11}$$

If we now set $\Lambda = 1 + \omega$ (not necessarily with the same $\omega$) and $\Lambda^{-1} = 1 - \omega$,
then this will be satisfied to first order in $\omega$ provided that $\sigma$ satisfies the
commutation relations,

$$\left[\sigma_{\alpha\beta}, \sigma_{\gamma\delta}\right] = \eta_{\gamma\beta}\, \sigma_{\alpha\delta} - \eta_{\gamma\alpha}\, \sigma_{\beta\delta} + \eta_{\delta\beta}\, \sigma_{\gamma\alpha} - \eta_{\delta\alpha}\, \sigma_{\gamma\beta} \tag{2.12.12}$$

with square brackets denoting the usual matrix commutator

$$[u, v] \equiv uv - vu$$

The reader can easily check that the matrices (2.12.9) and (2.12.10) do satisfy
Eq. (2.12.12). The problem of finding the general representations of the infinitesimal
homogeneous Lorentz group is thus reduced to the problem of finding all matrices
that satisfy the commutation relations (2.12.12).

These commutation relations can be put in a somewhat more familiar form by
defining the matrices

$$a_1 \equiv \tfrac{1}{2}\left[-i\sigma_{23} + \sigma_{10}\right] \qquad b_1 \equiv \tfrac{1}{2}\left[-i\sigma_{23} - \sigma_{10}\right]$$

$$a_2 \equiv \tfrac{1}{2}\left[-i\sigma_{31} + \sigma_{20}\right] \qquad b_2 \equiv \tfrac{1}{2}\left[-i\sigma_{31} - \sigma_{20}\right] \tag{2.12.13}$$

$$a_3 \equiv \tfrac{1}{2}\left[-i\sigma_{12} + \sigma_{30}\right] \qquad b_3 \equiv \tfrac{1}{2}\left[-i\sigma_{12} - \sigma_{30}\right]$$

Equation (2.12.12) then takes the form

$$\mathbf{a} \times \mathbf{a} = i\mathbf{a} \tag{2.12.14}$$

$$\mathbf{b} \times \mathbf{b} = i\mathbf{b} \tag{2.12.15}$$

$$[a_i, b_j] = 0 \tag{2.12.16}$$

Equations (2.12.14)-(2.12.16) are simply the commutation relations for a pair of
independent angular momentum matrices. The rules for constructing such matrices
can be found in any book on nonrelativistic quantum mechanics$^{11}$: In the most
general case $\mathbf{a}$ and $\mathbf{b}$ are a direct sum of “irreducible” components, each
characterized by an integer or half-integer A or B, with

$$\mathbf{a}^2 = A(A + 1) \qquad \mathbf{b}^2 = B(B + 1) \tag{2.12.17}$$

and with dimensionality 2A + 1 or 2B + 1. Thus the most general objects $\psi_n$,
which transform linearly under infinitesimal homogeneous Lorentz transformations,
can be decomposed into “irreducible” pieces, characterized by a pair of
integers and/or half-integers (A, B), each piece having (2A + 1)(2B + 1)
components.

A straightforward calculation shows that the contravariant vector representation
(2.12.9), as well as its covariant counterpart, has $A = B = \tfrac{1}{2}$. Any tensor
representation, such as (2.12.10), can be regarded as a direct product of vector
representations, so it consists only of irreducible components with A + B an
integer; for instance, the general second-rank tensor representation (2.12.10)
consists of irreducible components with (A, B) equal to (1, 1), (1, 0), (0, 1), and
(0, 0). The representations in which A + B is a half-integer are quite distinct from
the tensors, and are called spinor representations. The most familiar example is the
Dirac electron field, which consists of components with (A, B) equal to $(\tfrac{1}{2}, 0)$
and $(0, \tfrac{1}{2})$.

The transformation property of any object under ordinary spatial rotations is
determined by its behavior with respect to infinitesimal Lorentz transformations
(2.12.5) for which $\omega_{i0} = 0$, and hence by the structure of the purely spatial
components $\sigma_{12}$, $\sigma_{23}$, $\sigma_{31}$ of $\sigma_{\alpha\beta}$. From these components, we can construct a matrix
vector

$$\mathbf{s} \equiv \mathbf{a} + \mathbf{b} = -i\{\sigma_{23}, \sigma_{31}, \sigma_{12}\} \tag{2.12.18}$$

that according to (2.12.14)-(2.12.16) has the commutation relations of an angular
momentum:

$$\mathbf{s} \times \mathbf{s} = i\mathbf{s} \tag{2.12.19}$$

Any irreducible representation (A, B) of the homogeneous Lorentz group can be
decomposed$^{11}$ into pieces with $\mathbf{s}^2$ equal to s(s + 1), where s is an integer or
half-integer between |A − B| and A + B; each term describes excitations (e.g.,
particles) of spin s. It follows then from (2.12.18) that the tensor representations
can describe only excitations with integer spin, whereas the spinor representations
describe only excitations with half-integer spin.
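The statements about the vector representation can be verified by brute force. The
sketch below is a check under stated assumptions rather than a derivation: it takes the
generators in the form $[\sigma_{\alpha\beta}]^\gamma{}_\delta = \delta_\alpha{}^\gamma \eta_{\beta\delta} - \delta_\beta{}^\gamma \eta_{\alpha\delta}$ with
$\eta = {\rm diag}(-1, 1, 1, 1)$ (index 0 for time), tests the commutation relations in the form
$[\sigma_{\alpha\beta}, \sigma_{\gamma\delta}] = \eta_{\gamma\beta}\sigma_{\alpha\delta} - \eta_{\gamma\alpha}\sigma_{\beta\delta} + \eta_{\delta\beta}\sigma_{\gamma\alpha} - \eta_{\delta\alpha}\sigma_{\gamma\beta}$,
builds $a_i = \tfrac{1}{2}[-i\sigma_{jk} + \sigma_{i0}]$ and $b_i = \tfrac{1}{2}[-i\sigma_{jk} - \sigma_{i0}]$ for cyclic (i, j, k),
and confirms that $\mathbf{a}^2 = \mathbf{b}^2 = \tfrac{3}{4}$, that is, $A = B = \tfrac{1}{2}$. All helper names are
hypothetical.

```python
ETA_DIAG = [-1.0, 1.0, 1.0, 1.0]   # eta = diag(-1, 1, 1, 1); index 0 is time

def eta(a, b):
    return ETA_DIAG[a] if a == b else 0.0

def delta(a, b):
    return 1.0 if a == b else 0.0

def sigma(a, b):
    """Vector-representation generator: entry [g][d] holds [sigma_ab]^g_d."""
    return [[complex(delta(a, g) * eta(b, d) - delta(b, g) * eta(a, d))
             for d in range(4)] for g in range(4)]

def mul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(4)) for j in range(4)] for i in range(4)]

def add(x, y, cy=1.0):
    """Return x + cy * y."""
    return [[x[i][j] + cy * y[i][j] for j in range(4)] for i in range(4)]

def scale(c, x):
    return [[c * x[i][j] for j in range(4)] for i in range(4)]

def comm(x, y):
    return add(mul(x, y), mul(y, x), -1.0)

def close(x, y, tol=1e-12):
    return all(abs(x[i][j] - y[i][j]) < tol for i in range(4) for j in range(4))

def check_commutation_relations():
    """Test the commutation relations for every choice of the four indices."""
    for a in range(4):
        for b in range(4):
            for g in range(4):
                for d in range(4):
                    rhs = add(add(scale(eta(g, b), sigma(a, d)), sigma(b, d), -eta(g, a)),
                              add(scale(eta(d, b), sigma(g, a)), sigma(g, b), -eta(d, a)))
                    if not close(comm(sigma(a, b), sigma(g, d)), rhs):
                        return False
    return True

# (j, k) runs over the cyclic pairs (2, 3), (3, 1), (1, 2) for i = 1, 2, 3.
pairs = [(2, 3), (3, 1), (1, 2)]
a_mats = [scale(0.5, add(scale(-1j, sigma(j, k)), sigma(i + 1, 0)))
          for i, (j, k) in enumerate(pairs)]
b_mats = [scale(0.5, add(scale(-1j, sigma(j, k)), sigma(i + 1, 0), -1.0))
          for i, (j, k) in enumerate(pairs)]

IDENTITY = [[complex(i == j) for j in range(4)] for i in range(4)]
ZERO = [[0j] * 4 for _ in range(4)]
a_squared = add(add(mul(a_mats[0], a_mats[0]), mul(a_mats[1], a_mats[1])), mul(a_mats[2], a_mats[2]))
b_squared = add(add(mul(b_mats[0], b_mats[0]), mul(b_mats[1], b_mats[1])), mul(b_mats[2], b_mats[2]))

print(check_commutation_relations())
print(close(comm(a_mats[0], a_mats[1]), scale(1j, a_mats[2])))   # [a_1, a_2] = i a_3
print(all(close(comm(x, y), ZERO) for x in a_mats for y in b_mats))
print(close(a_squared, scale(0.75, IDENTITY)), close(b_squared, scale(0.75, IDENTITY)))
```

With $A = B = \tfrac{1}{2}$ the representation has (2A + 1)(2B + 1) = 4 components, matching the
four components of a vector, and the allowed spins s run from |A − B| = 0 to A + B = 1.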

Finite Lorentz transformations can be built up by multiplying together an 
infinite number of infinitesimal Lorentz transformations. In the same way, the 
tensor representations of the infinitesimal Lorentz group can be used to construct 
the tensor representations, such as (2.12.3) and (2.12.4), of the group of finite 
Lorentz transformations. However, if we try to construct spinor representations of 
the finite Lorentz transformations, we find that we can only get “representations
up to a sign”;$^{12}$ that is, the group multiplication law (2.12.2) will occasionally have
a minus sign on the right-hand side. For instance, the product of two successive 
180° rotations about a given axis does not give the unit matrix, but minus the 
unit matrix. The appearance of these minus signs means that a spinor field itself 
cannot be a physical observable, though even functions of spinor fields can be 
observables. 
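The minus sign can be seen in the simplest spinor representation. The sketch below
assumes the standard spin-1/2 rotation matrix about the z axis, $\exp(-i\theta\sigma_z/2)$ with
$\sigma_z$ the diagonal Pauli matrix; this explicit form and its sign convention are not taken
from the text above and are used only for illustration.

```python
import cmath
import math

def spin_half_rotation(theta):
    """2x2 matrix exp(-i * theta * sigma_z / 2): rotation by theta about the z axis."""
    return [[cmath.exp(-0.5j * theta), 0j],
            [0j, cmath.exp(0.5j * theta)]]

def matmul2(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

half_turn = spin_half_rotation(math.pi)       # a 180 degree rotation
full_turn = matmul2(half_turn, half_turn)     # two successive 180 degree rotations

# The product is minus the unit matrix, up to rounding.
print(all(abs(full_turn[i][j] - (-1.0 if i == j else 0.0)) < 1e-12
          for i in range(2) for j in range(2)))
```

A bilinear such as $\psi^\dagger \psi$ is unchanged when $\psi$ goes to $-\psi$, which illustrates why even
functions of spinor fields can still be observables.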


13 Temporal Order and Antiparticles* 

One of the most striking features of the Lorentz transformations is that they 
do not leave invariant the order of events. For instance, suppose that in one 
reference frame an event at $x_2$ is observed to occur later than one at $x_1$, that is,
$x_2^0 > x_1^0$. A second observer who sees the first observer moving with velocity $\mathbf{v}$
will see the events separated by a time difference

$$x_2'^{\,0} - x_1'^{\,0} = \Lambda^0{}_\alpha(\mathbf{v})\left(x_2^\alpha - x_1^\alpha\right)$$

where $\Lambda^\beta{}_\alpha(\mathbf{v})$ is the “boost” defined by (2.1.17)-(2.1.21). Applying (2.1.17) and
(2.1.21) gives then

$$x_2'^{\,0} - x_1'^{\,0} = \gamma\left(x_2^0 - x_1^0\right) + \gamma\, \mathbf{v}\cdot(\mathbf{x}_2 - \mathbf{x}_1)$$

and this will be negative if

$$\mathbf{v}\cdot(\mathbf{x}_2 - \mathbf{x}_1) < -\left(x_2^0 - x_1^0\right) \tag{2.13.1}$$

* This section lies somewhat out of the book’s main line of development, and may be omitted in a first
reading.

At first sight this might seem to raise the danger of a logical paradox. Suppose that
the first observer sees a radioactive decay A → B + C at $x_1$, followed at $x_2$ by
absorption of particle B, for example, B + D → E. Does the second observer then
see B absorbed at $x_2$ before it is emitted at $x_1$? The paradox disappears if we note
that the speed $|\mathbf{v}|$ characterizing any Lorentz transformation $\Lambda(\mathbf{v})$ must be less
than unity, so that (2.13.1) can be satisfied only if

$$|\mathbf{x}_2 - \mathbf{x}_1| > |x_2^0 - x_1^0| \tag{2.13.2}$$

However, this is impossible, because particle B was assumed to travel from $x_1$ to
$x_2$, and (2.13.2) would require its speed to be greater than unity, that is, than the
speed of light. To put it another way, the temporal order of events at $x_1$ and $x_2$ is
affected by Lorentz transformations only if $x_1 - x_2$ is spacelike, that is,

$$\eta_{\alpha\beta}\, (x_1 - x_2)^\alpha (x_1 - x_2)^\beta > 0$$

whereas a particle can travel from $x_1$ to $x_2$ only if $x_1 - x_2$ is timelike, that is,

$$\eta_{\alpha\beta}\, (x_1 - x_2)^\alpha (x_1 - x_2)^\beta < 0$$
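The distinction between the two cases is easy to explore numerically. The following
sketch restricts the motion to one spatial axis, so that the transformed time difference
is $\gamma(\Delta t + v\, \Delta x)$, and scans over boost velocities with |v| < 1. The particular separations
and the velocity grid are illustrative choices.

```python
import math

def boosted_time_difference(dt, dx, v):
    """Time separation seen by the second observer, for motion along one axis.

    dt = x2^0 - x1^0 and dx = x2 - x1 in the first frame; v is the signed
    velocity of the first observer as seen by the second, with |v| < 1.
    """
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    return gamma * (dt + v * dx)

velocities = [k / 100.0 for k in range(-99, 100)]

# Timelike separation (|dx| < |dt|): the sign of dt is the same in every frame.
timelike_ok = all(boosted_time_difference(1.0, 0.5, v) > 0.0 for v in velocities)

# Spacelike separation (|dx| > |dt|): some boosts reverse the temporal order.
spacelike_reversed = any(boosted_time_difference(1.0, 2.0, v) < 0.0 for v in velocities)

print(timelike_ok, spacelike_reversed)
```

In the spacelike example the order flips once v is more negative than −1/2, the value
at which $\Delta t + v\, \Delta x$ changes sign.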

Although the relativity of temporal order raises no problems for classical
physics, it plays a profound role in quantum theories. The uncertainty principle
tells us that when we specify that a particle is at position $\mathbf{x}_1$ at time $t_1$, we cannot
also define its velocity precisely. In consequence there is a certain chance of a
particle getting from $x_1$ to $x_2$ even if $x_1 - x_2$ is spacelike, that is,
$|\mathbf{x}_1 - \mathbf{x}_2| > |x_1^0 - x_2^0|$. To be more precise, the probability of a particle reaching $x_2$ if it
starts at $x_1$ is nonnegligible as long as

$$(\mathbf{x}_1 - \mathbf{x}_2)^2 - \left(x_1^0 - x_2^0\right)^2 \lesssim \frac{\hbar^2}{m^2}$$

where $\hbar$ is Planck’s constant (divided by $2\pi$) and m is the particle mass. (Such
space-time intervals are very small even for elementary particle masses; for
instance, if m is the mass of a proton then $\hbar/m = 2 \times 10^{-14}$ cm or in time units
$6 \times 10^{-25}$ sec. Recall that in our units 1 sec = $3 \times 10^{10}$ cm.) We are thus faced
again with our paradox; if one observer sees a particle emitted at $x_1$, and absorbed
at $x_2$, and if $(\mathbf{x}_1 - \mathbf{x}_2)^2 - (x_1^0 - x_2^0)^2$ is positive (but less than $\hbar^2/m^2$), then a
second observer may see the particle absorbed at $x_2$ at a time $t_2$ before the time $t_1$
it is emitted at $x_1$.
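The orders of magnitude quoted for the proton can be reproduced with a few lines. The
sketch below restores the factors of c that are set to unity in the text, so that the length
is $\hbar/(mc)$; the SI constants are rounded values supplied here as assumptions.

```python
# Rounded SI values; treat them as assumptions of this sketch.
HBAR = 1.054571817e-34        # J s
PROTON_MASS = 1.67262192e-27  # kg
C = 2.99792458e8              # m / s

def reduced_compton_length_cm(mass_kg):
    """hbar / (m c) in centimetres: the length that 'hbar/m' denotes when c = 1."""
    return HBAR / (mass_kg * C) * 100.0

length_cm = reduced_compton_length_cm(PROTON_MASS)
time_sec = length_cm / (C * 100.0)   # convert with 1 sec = c cm (about 3e10 cm)

print(f"{length_cm:.1e} cm, {time_sec:.1e} sec")
```

With unrounded inputs the length comes out near $2.1 \times 10^{-14}$ cm and the time near
$7 \times 10^{-25}$ sec; the slightly smaller time quoted in the text follows from dividing the
rounded length $2 \times 10^{-14}$ cm by $3 \times 10^{10}$ cm/sec.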

There is only one known way out of this paradox. The second observer must 
see a particle emitted at $x_2$ and absorbed at $x_1$. But in general the particle seen by
the second observer will then necessarily be different from that seen by the first.
For instance, if the first observer sees a proton turn into a neutron and a positive
pi-meson at $x_1$ and then sees the pi-meson and some other neutron turn into a
proton at $x_2$, then the second observer must see the neutron at $x_2$ turn into a proton
and a particle of negative charge, which is then absorbed by a proton at $x_1$ that
turns into a neutron. Since mass is a Lorentz invariant, the mass of the negative
particle seen by the second observer will be equal to that of the positive pi-meson
seen by the first observer. There is such a particle, called a negative pi-meson, and
it does indeed have the same mass as the positive pi-meson. This reasoning leads
us to the conclusion that for every type of charged particle there is an oppositely
charged particle of equal mass, called its antiparticle. Note that this conclusion
does not obtain in nonrelativistic quantum mechanics or in relativistic classical
mechanics; it is only in relativistic quantum mechanics that antiparticles are a
necessity.$^{13}$ And it is the existence of antiparticles that leads to the characteristic
feature of relativistic quantum dynamics, that given enough energy we can create
arbitrary numbers of particles and their antiparticles.


PART TWO
THE GENERAL THEORY 
OF RELATIVITY 




“Either the well was very 
deep, or she fell very slowly, 
for she had plenty of time 
as she went down to look 
about her, and to wonder 
what was going to happen 
next.” Lewis Carroll, Alice's 
Adventures in Wonderland 


3 THE PRINCIPLE 
OF EQUIVALENCE 


The Principle of the Equivalence of Gravitation and Inertia tells us how an 
arbitrary physical system responds to an external gravitational field. We shall first 
see what this principle says, and then in the balance of this chapter we shall take a 
look at a few of its consequences. However, the appropriate mathematical 
technique for implementing the Principle of Equivalence is tensor analysis, and 
only after we complete the introduction to tensor analysis in the next chapter 
will we be able to make use of the full content of this principle. 


1 Statement of the Principle 

The Principle of Equivalence rests on the equality of gravitational and inertial 
mass, demonstrated by Galileo, Huygens, Newton, Bessel, and Eotvos. (See 
Section 1.2.) Einstein reflected that, as a consequence, no external static homo- 
geneous gravitational field could be detected in a freely falling elevator, for the 
observers, their test bodies, and the elevator itself would respond to the field with 
the same acceleration. This can be easily proved for a system of particles N,
moving with nonrelativistic velocities under the influence of forces $\mathbf{F}(\mathbf{x}_N - \mathbf{x}_M)$
(e.g., electrostatic or gravitational forces) and an external gravitational field $\mathbf{g}$.
The equations of motion are

$$m_N \frac{d^2 \mathbf{x}_N}{dt^2} = m_N \mathbf{g} + \sum_M \mathbf{F}(\mathbf{x}_N - \mathbf{x}_M) \tag{3.1.1}$$

Suppose that we perform a non-Galilean space-time coordinate transformation

$$\mathbf{x}' = \mathbf{x} - \tfrac{1}{2}\, \mathbf{g} t^2 \qquad t' = t \tag{3.1.2}$$

Then $\mathbf{g}$ will be canceled by an inertial “force,” and the equation of motion will
become

$$m_N \frac{d^2 \mathbf{x}'_N}{dt'^2} = \sum_M \mathbf{F}(\mathbf{x}'_N - \mathbf{x}'_M) \tag{3.1.3}$$

Hence the original observer O who uses coordinates $\mathbf{x}$, t, and his freely falling friend
O′ who uses $\mathbf{x}'$, t′, will detect no difference in the laws of mechanics, except that O
will say that he feels a gravitational field and O′ will say that he does not. The
equivalence principle says that this cancellation of gravitational by inertial force
(and hence their equivalence) will obtain for all freely falling systems, whether or
not they can be described by simple equations such as (3.1.1).
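The cancellation can be watched in a small simulation. The sketch below integrates a
one-dimensional, two-particle version of the equations of motion in a uniform field g,
applies the transformation x′ = x − ½gt², and compares the result with an integration in
which g is absent. The linear force law F(x_N − x_M) = −k(x_N − x_M), the masses, the
initial data, and the velocity-Verlet integrator are all assumptions made for illustration.

```python
def accelerations(xs, masses, g, k):
    """Accelerations in one dimension for a uniform field g plus an illustrative
    linear two-body force F(x_N - x_M) = -k (x_N - x_M)."""
    acc = []
    for n, (x_n, m_n) in enumerate(zip(xs, masses)):
        force = sum(-k * (x_n - x_m) for m, x_m in enumerate(xs) if m != n)
        acc.append(g + force / m_n)
    return acc

def integrate(xs, vs, masses, g, k, dt, steps):
    """Velocity-Verlet integration; returns the positions after `steps` steps."""
    xs, vs = list(xs), list(vs)
    acc = accelerations(xs, masses, g, k)
    for _ in range(steps):
        xs = [x + v * dt + 0.5 * a * dt * dt for x, v, a in zip(xs, vs, acc)]
        new_acc = accelerations(xs, masses, g, k)
        vs = [v + 0.5 * (a + b) * dt for v, a, b in zip(vs, acc, new_acc)]
        acc = new_acc
    return xs

masses, k, g = [1.0, 3.0], 2.0, -9.8
x0, v0 = [0.0, 1.5], [0.2, -0.1]
dt, steps = 1e-3, 2000
t = dt * steps

with_field = integrate(x0, v0, masses, g, k, dt, steps)     # observer O
falling = [x - 0.5 * g * t * t for x in with_field]          # freely falling coordinates
no_field = integrate(x0, v0, masses, 0.0, k, dt, steps)      # same system with g absent

print(max(abs(a - b) for a, b in zip(falling, no_field)) < 1e-9)
```

The agreement holds to rounding error because the force depends only on coordinate
differences, which the transformation leaves unchanged, and because g is uniform. With
a position-dependent g the two trajectories would no longer coincide.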

We are not yet ready to state the Principle of Equivalence in its final form, 
because the preceding remarks dealt only with a static homogeneous gravitational 
field. Had g depended on x or t, we would not have been able to eliminate it from
the equations of motion by the acceleration (3.1.2). For example, the earth is in 
free fall about the sun, and for the most part we on earth do not feel the sun’s 
gravitational field, but the slight inhomogeneity in this field (about 1 part in 6000 
from noon to midnight) is enough to raise impressive tides in our oceans. Even the 
observers in Einstein’s freely falling elevator would in principle be able to detect 
the earth’s field, because objects in the elevator would be falling radially toward the 
center of the earth, and hence would approach each other as the elevator descended. 
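The size of the solar inhomogeneity quoted above can be estimated from the
inverse-square law alone. The sketch below compares the sun's field at the points of
the earth nearest to and farthest from the sun (noon and midnight), using rounded
values of the earth's radius and the earth-sun distance that are supplied here as
assumptions.

```python
# Rounded astronomical values, assumed here for illustration.
EARTH_RADIUS_KM = 6.371e3
EARTH_SUN_DISTANCE_KM = 1.496e8

def fractional_field_change(separation_km, distance_km):
    """Fractional change of an inverse-square field between two points whose
    distances from the source differ by `separation_km`."""
    near = distance_km - 0.5 * separation_km
    far = distance_km + 0.5 * separation_km
    return (far / near) ** 2 - 1.0   # g(near) / g(far) - 1

change = fractional_field_change(2.0 * EARTH_RADIUS_KM, EARTH_SUN_DISTANCE_KM)
print(f"about 1 part in {1.0 / change:.0f}")
```

To leading order the fractional change is four earth radii divided by the earth-sun
distance, which is consistent with the figure of about 1 part in 6000.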

Although inertial forces do not exactly cancel gravitational forces for freely 
falling systems in an inhomogeneous or time- dependent gravitational field, we can 
still expect an approximate cancellation if we restrict our attention to such a small 
region of space and time that the field changes very little over the region. Therefore 
we formulate the equivalence principle as the statement that at every space-time 
point in an arbitrary gravitational field it is possible to choose a “locally inertial
coordinate system” such that, within a sufficiently small region of the point in question,
the laws of nature take the same form as in unaccelerated Cartesian coordinate systems
in the absence of gravitation. There is a little vagueness here about what we mean by
“the same form as in unaccelerated Cartesian coordinate systems,” so to avoid any
possible ambiguity we can specify that by this we mean the form given to the laws
of nature by special relativity, for example, such equations as (2.3.1), (2.7.6),
(2.7.7), (2.7.9), and (2.8.7). There is also a question of how small is “sufficiently
small.” Roughly speaking, we mean that the region must be small enough so that
the gravitational field is sensibly constant throughout it, but we cannot be more 
precise until we learn how to represent the gravitational field mathematically. 
(See the end of Section 4.1.) 

The attentive reader may have noticed a certain resemblance between the
Principle of Equivalence and the axiom which Gauss took as the basis of
non-Euclidean geometry. The Principle of Equivalence says that at any point in
space-time we may erect a locally inertial coordinate system in which matter satisfies the
laws of special relativity. We saw in Chapter 1 that Gauss assumed that at any
point on a curved surface we may erect a locally Cartesian coordinate system in
which distances obey the law of Pythagoras. Because of this deep analogy, we
should expect the laws of gravitation to bear a strong resemblance to the formulas
of Riemannian geometry. In particular, Gauss’s assumption implies that all inner
properties of a curved surface can be described in terms of derivatives $\partial \xi^\alpha/\partial x^\mu$ of
the function $\xi^\alpha(x)$ that defines the transformation $x \to \xi$ from some general
coordinate system $x^\mu$ covering the surface to the locally Cartesian system $\xi^\alpha$,
whereas the Principle of Equivalence tells us that all effects of a gravitational field
can be described in terms of derivatives $\partial \xi^\alpha/\partial x^\mu$ of the function $\xi^\alpha(x)$ that defines
the transformation from the “laboratory” coordinates $x^\mu$ to the locally inertial
coordinates $\xi^\alpha$. Furthermore, it was shown in Chapter 1 that the geometrically
relevant functions of these derivatives are the quantities $g_{\mu\nu}$ defined by Eq. (1.1.7);
we shall see in the following sections of this chapter that the gravitational field is
described in just the same way.

Occasionally one finds references to a “weak Principle of Equivalence” and a 
“strong Principle of Equivalence.” The strong Principle of Equivalence is just what 
I have already stated, with “laws of nature” meaning all the laws of nature. The 
weak principle is the same, but with “laws of nature” replaced by “laws of motion 
of freely falling particles.” That is, the weak principle is nothing but a restatement 
of the observed equality of gravitational and inertial mass, whereas the strong 
principle is a generalization of these observations that governs the effects of 
gravitation on all physical systems. 

The experiments of Eötvös, Dicke, and their predecessors (see Section 1.2) 
provide direct verification only of the weak Principle of Equivalence, but they 
provide some indirect evidence for the strong principle. The mass of different 
substances arises in different proportions from the masses of the neutrons and 
protons plus electrons of which they are composed, and from the strong and 
electromagnetic forces that bind these particles together, so the ratio of gravitational 
to inertial mass will be equal for all these substances only if it is equal for 
their constituents. Wapstra and Nijgh$^{1}$ have shown that the limits set by Eötvös 
on any possible inequality in the ratio of gravitational to inertial mass for glass, 
cork, antimonite, and brass imply that this ratio is equal for neutrons and protons 
plus electrons to 1 part in $6 \times 10^{5}$ and equal for neutrons and binding energies to 
1 part in $1.2 \times 10^{4}$. To this accuracy, an observer in a freely falling coordinate 
system will detect no gravitational force on neutrons, hydrogen, or their binding 
energies. It would be difficult to conceive of a theory that satisfies this requirement 
and does not go all the way to the strong principle (that no gravitational effects of 
any sort can be felt in a locally inertial frame). 

We might, however, distinguish two versions of the strong principle of 
equivalence, a “very strong principle,” which applies to all phenomena, and a 
“medium-strong principle,” which applies to all phenomena except gravitation 
itself. Certainly the experiments of Eötvös and Dicke are not accurate enough to 
say whether gravitational binding energies affect inertial and gravitational masses 
in the same way. This question might be settled by studying the motion of a small 
body in orbit about a large body that is itself in free fall in a gravitational field. 
For instance, the gravitational binding energy of the earth contributes a fraction 
$-8.4 \times 10^{-10}$ of its total mass, whereas the gravitational binding energy of an 
artificial satellite contributes a very much smaller fraction of its mass. Thus, if 
(to take an extreme case) the (negative) gravitational binding energy contributes 
fully to the inertial mass but not at all to the gravitational mass, then the ratio of 
gravitational to inertial mass of the satellite would be greater than that for the 
earth by a fraction $8.4 \times 10^{-10}$. The earth is in free fall, with the gravitational 
attraction of the sun balanced by the inertial force owing to the earth’s revolution. 
The gravitational and inertial forces on the satellite owing to the presence of the 
sun and the earth’s revolution are equal (neglecting for a moment the distance 
between the satellite and the earth’s center of mass) to the gravitational and 
inertial forces on the earth times the ratio of gravitational or inertial masses, so 
these two forces are not in balance for the satellite, the gravitational force being 
greater than the inertial force by a fraction $8.4 \times 10^{-10}$. The acceleration owing 
to the sun’s gravity is at the orbit of the earth about $6 \times 10^{-4}$ of the acceleration 
owing to the earth’s gravity at the surface of the earth, so we conclude that if the 
gravitational binding energy of the earth contributed fully to its inertial mass but 
not at all to its gravitational mass, then an artificial satellite in a low orbit about 
the earth would feel an effective attraction toward the sun equal to about 
$5.4 \times 10^{-13}$ times its gravitational attraction toward the earth. This tiny effect 
would be entirely masked by a “tidal” force because the satellite is far from the 
center of mass of the earth, and there is no prospect of its being measured. This is a 
pity, because it is precisely the very strong assumption that the Principle of 
Equivalence applies to gravitational fields that will lead us in Chapter 5 to 
Einstein’s field equations for gravitation. 
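The size of this hypothetical effect is just the product of the two small ratios quoted above. The following minimal sketch (the function and variable names are illustrative, not part of the text) repeats that arithmetic; with the rounded inputs as quoted, the product comes out near $5 \times 10^{-13}$, the same order as the figure given above.

```python
# Dimensionless ratios quoted in the text (rounded values).
BINDING_FRACTION = 8.4e-10    # |gravitational binding energy| / total mass, for the earth
SUN_TO_EARTH_ACCEL = 6e-4     # sun's gravity at the earth's orbit / earth's surface gravity


def anomalous_solar_attraction(binding_fraction, accel_ratio):
    """Unbalanced pull toward the sun on a low-orbit satellite,
    in units of the satellite's attraction toward the earth."""
    return binding_fraction * accel_ratio


effect = anomalous_solar_attraction(BINDING_FRACTION, SUN_TO_EARTH_ACCEL)
print(f"effective solar attraction / earth attraction = {effect:.2e}")
```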


2 Gravitational Forces 


Consider a particle moving freely under the influence of purely gravitational 
forces. According to the Principle of Equivalence, there is a freely falling coordinate 
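system in which its equation of motion is that of a straight line in space-time, 
that is, 

$$\frac{d^{2}\xi^{\alpha}}{d\tau^{2}} = 0 \qquad (3.2.1)$$

with $d\tau$ the proper time 

$$d\tau^{2} = -\eta_{\alpha\beta}\, d\xi^{\alpha}\, d\xi^{\beta} \qquad (3.2.2)$$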


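[Compare Eqs. (2.3.1) and (2.1.4).] Now suppose that we use any other coordinate 
system $x^{\mu}$, which may be a Cartesian coordinate system at rest in the laboratory, 
but also may be curvilinear, accelerated, rotating, or what we will. The freely 
falling coordinates $\xi^{\alpha}$ are functions of the $x^{\mu}$, and Eq. (3.2.1) becomes 

$$0 = \frac{d}{d\tau}\left( \frac{\partial \xi^{\alpha}}{\partial x^{\mu}} \frac{dx^{\mu}}{d\tau} \right) = \frac{\partial \xi^{\alpha}}{\partial x^{\mu}} \frac{d^{2}x^{\mu}}{d\tau^{2}} + \frac{\partial^{2}\xi^{\alpha}}{\partial x^{\mu}\,\partial x^{\nu}} \frac{dx^{\mu}}{d\tau} \frac{dx^{\nu}}{d\tau}$$

Multiply this by $\partial x^{\lambda}/\partial \xi^{\alpha}$, and use the familiar product rule 

$$\frac{\partial \xi^{\alpha}}{\partial x^{\mu}} \frac{\partial x^{\lambda}}{\partial \xi^{\alpha}} = \delta^{\lambda}_{\mu}$$

This gives the equation of motion 

$$0 = \frac{d^{2}x^{\lambda}}{d\tau^{2}} + \Gamma^{\lambda}_{\mu\nu} \frac{dx^{\mu}}{d\tau} \frac{dx^{\nu}}{d\tau} \qquad (3.2.3)$$

where $\Gamma^{\lambda}_{\mu\nu}$ is the affine connection, defined by 

$$\Gamma^{\lambda}_{\mu\nu} \equiv \frac{\partial x^{\lambda}}{\partial \xi^{\alpha}} \frac{\partial^{2}\xi^{\alpha}}{\partial x^{\mu}\,\partial x^{\nu}} \qquad (3.2.4)$$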


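The proper time (3.2.2) may also be expressed in an arbitrary coordinate 
system, 

$$d\tau^{2} = -\eta_{\alpha\beta} \frac{\partial \xi^{\alpha}}{\partial x^{\mu}}\, dx^{\mu}\, \frac{\partial \xi^{\beta}}{\partial x^{\nu}}\, dx^{\nu} \qquad (3.2.5)$$

or 

$$d\tau^{2} = -g_{\mu\nu}\, dx^{\mu}\, dx^{\nu} \qquad (3.2.6)$$

where $g_{\mu\nu}$ is the metric tensor, defined by 

$$g_{\mu\nu} \equiv \frac{\partial \xi^{\alpha}}{\partial x^{\mu}} \frac{\partial \xi^{\beta}}{\partial x^{\nu}}\, \eta_{\alpha\beta} \qquad (3.2.7)$$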


For a photon or a neutrino the equation of motion in a freely falling system is 
the same as (3.2.1), except that the independent variable cannot be taken as the 
proper time (3.2.2), because for massless particles the right-hand side of (3.2.2) 
vanishes. Instead of $\tau$ we can use $\sigma \equiv \xi^{0}$, so that (3.2.1) and (3.2.2) become 

$$\frac{d^{2}\xi^{\alpha}}{d\sigma^{2}} = 0$$

$$0 = -\eta_{\alpha\beta} \frac{d\xi^{\alpha}}{d\sigma} \frac{d\xi^{\beta}}{d\sigma}$$

Following the same reasoning as before, we find that the equation of motion in an 
arbitrary gravitational field and an arbitrary coordinate system is 

$$\frac{d^{2}x^{\mu}}{d\sigma^{2}} + \Gamma^{\mu}_{\nu\lambda} \frac{dx^{\nu}}{d\sigma} \frac{dx^{\lambda}}{d\sigma} = 0 \qquad (3.2.8)$$

$$0 = -g_{\mu\nu} \frac{dx^{\mu}}{d\sigma} \frac{dx^{\nu}}{d\sigma} \qquad (3.2.9)$$

with $\Gamma^{\mu}_{\nu\lambda}$ and $g_{\mu\nu}$ given as before by (3.2.4) and (3.2.7). 

Incidentally, in both (3.2.3) and (3.2.8) we do not need to know what $\tau$ and $\sigma$ 
are in order to find the motion of our particle, for these equations when solved give 
$x^{\mu}(\tau)$ or $x^{\mu}(\sigma)$, and $\tau$ or $\sigma$ can be eliminated to give $\mathbf{x}(t)$. The purpose of (3.2.6) is to 
tell us how to compute the proper time, whereas the purpose of (3.2.9) is to impose 
initial conditions appropriate to a massless particle. In particular, Eq. (3.2.9) tells 
us that the time $dt$ for a photon to travel a distance $d\mathbf{x}$ is determined by the 
quadratic equation 

$$0 = g_{00}\, dt^{2} + 2 g_{i0}\, dx^{i}\, dt + g_{ij}\, dx^{i}\, dx^{j}$$

with $i$ and $j$ summed over the values 1, 2, 3. The solution is 

$$dt = \frac{1}{g_{00}} \left[ -g_{i0}\, dx^{i} - \left\{ (g_{i0}\, g_{j0} - g_{ij}\, g_{00})\, dx^{i}\, dx^{j} \right\}^{1/2} \right] \qquad (3.2.10)$$

and the time required for light to travel along any path may be calculated by 
integrating $dt$ along the path. 
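As an illustration of how (3.2.10) would be used in practice, here is a minimal sketch that evaluates $dt$ for one small spatial step. The function name `photon_dt`, the nested-list layout of the metric (index 0 for time), and the sample "weak" metric, in which only $g_{00}$ is modified, are assumptions made purely to exercise the formula; they are not a model of any particular field.

```python
import math


def photon_dt(g, dx):
    """Coordinate time dt for light to cover the spatial step dx, Eq. (3.2.10).

    g  : 4x4 nested list, metric components g[mu][nu] with index 0 = time
    dx : list of the three spatial displacements dx^i
    """
    g00 = g[0][0]
    b = sum(g[i + 1][0] * dx[i] for i in range(3))              # g_i0 dx^i
    c = sum(g[i + 1][j + 1] * dx[i] * dx[j]
            for i in range(3) for j in range(3))                # g_ij dx^i dx^j
    # (g_i0 g_j0 - g_ij g_00) dx^i dx^j  =  b^2 - g_00 c
    return (-b - math.sqrt(b * b - g00 * c)) / g00


minkowski = [[-1.0, 0.0, 0.0, 0.0],
             [0.0, 1.0, 0.0, 0.0],
             [0.0, 0.0, 1.0, 0.0],
             [0.0, 0.0, 0.0, 1.0]]

# Illustrative metric: only g_00 = -(1 + 2*phi) differs from Minkowski.
phi = -1.0e-6
weak = [row[:] for row in minkowski]
weak[0][0] = -(1.0 + 2.0 * phi)

dt_flat = photon_dt(minkowski, [3.0, 4.0, 0.0])   # |dx| = 5 in units with c = 1
dt_weak = photon_dt(weak, [1.0, 0.0, 0.0])
print(dt_flat, dt_weak)
```

In flat space the result reduces to $dt = |d\mathbf{x}|$, as it must; the travel time along a finite path would be obtained by summing such steps.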

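The values of the metric tensor $g_{\mu\nu}$ and the affine connection $\Gamma^{\lambda}_{\mu\nu}$ at a point $X$ 
in an arbitrary coordinate system $x^{\mu}$ provide enough information to determine the 
locally inertial coordinates $\xi^{\alpha}(x)$ in a neighborhood of $X$. First, we multiply 
Eq. (3.2.4) by $\partial \xi^{\beta}/\partial x^{\lambda}$ and use the product rule 

$$\frac{\partial \xi^{\beta}}{\partial x^{\lambda}} \frac{\partial x^{\lambda}}{\partial \xi^{\alpha}} = \delta^{\beta}_{\alpha}$$

thereby obtaining the differential equations for $\xi^{\alpha}$: 

$$\frac{\partial^{2}\xi^{\alpha}}{\partial x^{\mu}\,\partial x^{\nu}} = \Gamma^{\lambda}_{\mu\nu} \frac{\partial \xi^{\alpha}}{\partial x^{\lambda}} \qquad (3.2.11)$$

The solution is 

$$\xi^{\alpha}(x) = a^{\alpha} + b^{\alpha}_{\ \mu}\,(x^{\mu} - X^{\mu}) + \tfrac{1}{2}\, b^{\alpha}_{\ \lambda}\, \Gamma^{\lambda}_{\mu\nu}\,(x^{\mu} - X^{\mu})(x^{\nu} - X^{\nu}) + \cdots \qquad (3.2.12)$$

where 

$$a^{\alpha} = \xi^{\alpha}(X), \qquad b^{\alpha}_{\ \lambda} = \frac{\partial \xi^{\alpha}(X)}{\partial X^{\lambda}} \qquad (3.2.13)$$

From Eq. (3.2.7) we also learn that 

$$\eta_{\alpha\beta}\, b^{\alpha}_{\ \mu}\, b^{\beta}_{\ \nu} = g_{\mu\nu}(X) \qquad (3.2.14)$$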

Thus, given $\Gamma^{\lambda}_{\mu\nu}$ and $g_{\mu\nu}$ at $X$, the locally inertial coordinates $\xi^{\alpha}$ are determined to 
order $(x - X)^{2}$, except for the ambiguity in the constants $a^{\alpha}$ and $b^{\alpha}_{\ \lambda}$. The $b^{\alpha}_{\ \lambda}$ are 
determined by Eq. (3.2.14) up to a Lorentz transformation $b^{\alpha}_{\ \mu} \to \Lambda^{\alpha}_{\ \beta}\, b^{\beta}_{\ \mu}$, so the 
ambiguity in the solution for $\xi^{\alpha}(x)$ just reflects the fact that if $\xi^{\alpha}$ are locally inertial 
coordinates, then so are $\Lambda^{\alpha}_{\ \beta}\, \xi^{\beta} + c^{\alpha}$. Hence, since $\Gamma^{\lambda}_{\mu\nu}$ and $g_{\mu\nu}$ determine the locally 
inertial coordinates up to an inhomogeneous Lorentz transformation, and since the 
gravitational field can have no effects in a locally inertial coordinate system, we 
should not be surprised to find that all effects of gravitation are comprised in 
$\Gamma^{\lambda}_{\mu\nu}$ and $g_{\mu\nu}$. Note, however, that (3.2.12) satisfies (3.2.11) only at the point $x = X$; 
in order for it to be possible to solve (3.2.11) for all $x$, it is necessary for the derivatives 
of the affine connection to satisfy certain symmetry conditions, to be 
discussed in Chapter 5. 


3 Relation between $g_{\mu\nu}$ and $\Gamma^{\lambda}_{\mu\nu}$ 

Our treatment of freely falling particles has shown that the field that determines 
the gravitational force is the “affine connection” $\Gamma^{\lambda}_{\mu\nu}$, whereas the proper 
time interval between two events with a given infinitesimal coordinate separation 
is determined by the “metric tensor” $g_{\mu\nu}$. We now show that $g_{\mu\nu}$ is also the 
gravitational potential; that is, its derivatives determine the field $\Gamma^{\lambda}_{\mu\nu}$. 

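We first recall the formula for the metric tensor, Eq. (3.2.7): 

$$g_{\mu\nu} = \frac{\partial \xi^{\alpha}}{\partial x^{\mu}} \frac{\partial \xi^{\beta}}{\partial x^{\nu}}\, \eta_{\alpha\beta}$$

Differentiation with respect to $x^{\lambda}$ gives 

$$\frac{\partial g_{\mu\nu}}{\partial x^{\lambda}} = \frac{\partial^{2}\xi^{\alpha}}{\partial x^{\lambda}\,\partial x^{\mu}} \frac{\partial \xi^{\beta}}{\partial x^{\nu}}\, \eta_{\alpha\beta} + \frac{\partial \xi^{\alpha}}{\partial x^{\mu}} \frac{\partial^{2}\xi^{\beta}}{\partial x^{\lambda}\,\partial x^{\nu}}\, \eta_{\alpha\beta}$$

and recalling (3.2.11), we have 

$$\frac{\partial g_{\mu\nu}}{\partial x^{\lambda}} = \Gamma^{\rho}_{\lambda\mu} \frac{\partial \xi^{\alpha}}{\partial x^{\rho}} \frac{\partial \xi^{\beta}}{\partial x^{\nu}}\, \eta_{\alpha\beta} + \Gamma^{\rho}_{\lambda\nu} \frac{\partial \xi^{\alpha}}{\partial x^{\mu}} \frac{\partial \xi^{\beta}}{\partial x^{\rho}}\, \eta_{\alpha\beta}$$

Using (3.2.7) again, we find 

$$\frac{\partial g_{\mu\nu}}{\partial x^{\lambda}} = \Gamma^{\rho}_{\lambda\mu}\, g_{\rho\nu} + \Gamma^{\rho}_{\lambda\nu}\, g_{\rho\mu} \qquad (3.3.1)$$

Before solving for $\Gamma$, it is necessary to point out a subtlety in the derivation of 
Eq. (3.3.1) that has been hidden by our too-compact notation. When we erect a 
locally inertial coordinate system $\xi^{\alpha}(x)$, we do so at a specific point $X$, and the 
coordinates that are locally inertial at $X$ should be so labeled, as $\xi^{\alpha}_{X}(x)$. Thus 
Eqs. (3.2.7) and (3.2.11) should properly be written as 

$$g_{\mu\nu}(X) = \eta_{\alpha\beta} \left( \frac{\partial \xi^{\alpha}_{X}(x)}{\partial x^{\mu}} \frac{\partial \xi^{\beta}_{X}(x)}{\partial x^{\nu}} \right)_{x=X} \qquad (3.3.2)$$

$$\left( \frac{\partial^{2}\xi^{\alpha}_{X}(x)}{\partial x^{\mu}\,\partial x^{\nu}} \right)_{x=X} = \Gamma^{\lambda}_{\mu\nu}(X) \left( \frac{\partial \xi^{\alpha}_{X}(x)}{\partial x^{\lambda}} \right)_{x=X} \qquad (3.3.3)$$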

When we differentiate (3.3.2) with respect to $X^{\lambda}$, we get two kinds of terms. The 
first kind arises because we set $x = X$; these contain just the second derivatives 
(3.3.3) and can be easily calculated as before. The second kind of term arises 
because $\xi^{\alpha}_{X}(x)$ carries a label $X$; these terms contain derivatives like 

$$\left( \frac{\partial}{\partial X^{\lambda}} \frac{\partial \xi^{\alpha}_{X}(x)}{\partial x^{\mu}} \right)_{x=X} \qquad (3.3.4)$$


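and do not seem to have anything to do with the metric or the affine connection. 
In order to deal with this second kind of term it is necessary to sharpen somewhat 
our interpretation of what is meant by “locally inertial” in the Principle of 
Equivalence. We shall see in Section 5 that first derivatives of the metric tensor 
may be measured by comparing the rates of identical clocks an infinitesimal space-time 
distance apart. Hence we shall interpret the Principle of Equivalence as 
meaning that the locally inertial coordinates $\xi^{\alpha}_{X}$ that we construct at a given point $X$ 
can be chosen so that the first derivatives of the metric tensor vanish at $X$. In the 
coordinate system $\xi^{\alpha}_{X}$ the metric tensor at a point $X'$ is given by (3.3.2) as 

$$g^{X}_{\gamma\delta}(X') = \eta_{\alpha\beta} \left( \frac{\partial \xi^{\alpha}_{X'}(x)}{\partial \xi^{\gamma}_{X}(x)} \frac{\partial \xi^{\beta}_{X'}(x)}{\partial \xi^{\delta}_{X}(x)} \right)_{x=X'}$$

and our new interpretation of the Principle of Equivalence tells us that this 
quantity is stationary in $X'$ at $X' = X$. In order to use this information, we 
introduce an arbitrary “laboratory” coordinate system $x^{\mu}$ and write 

$$g_{\mu\nu}(X') = \eta_{\alpha\beta} \left( \frac{\partial \xi^{\alpha}_{X'}(x)}{\partial x^{\mu}} \frac{\partial \xi^{\beta}_{X'}(x)}{\partial x^{\nu}} \right)_{x=X'} = g^{X}_{\gamma\delta}(X') \left( \frac{\partial \xi^{\gamma}_{X}(x)}{\partial x^{\mu}} \frac{\partial \xi^{\delta}_{X}(x)}{\partial x^{\nu}} \right)_{x=X'}$$

Differentiating with respect to $X'^{\lambda}$ and setting $X' = X$ gives then (because 
$g^{X}_{\gamma\delta}(X')$ is stationary) 

$$\frac{\partial g_{\mu\nu}(X)}{\partial X^{\lambda}} = g^{X}_{\gamma\delta}(X) \left( \frac{\partial}{\partial x^{\lambda}} \left[ \frac{\partial \xi^{\gamma}_{X}(x)}{\partial x^{\mu}} \frac{\partial \xi^{\delta}_{X}(x)}{\partial x^{\nu}} \right] \right)_{x=X}$$

$$= \eta_{\gamma\delta} \left( \frac{\partial^{2}\xi^{\gamma}_{X}(x)}{\partial x^{\lambda}\,\partial x^{\mu}} \frac{\partial \xi^{\delta}_{X}(x)}{\partial x^{\nu}} + \frac{\partial \xi^{\gamma}_{X}(x)}{\partial x^{\mu}} \frac{\partial^{2}\xi^{\delta}_{X}(x)}{\partial x^{\lambda}\,\partial x^{\nu}} \right)_{x=X}$$

No derivatives like (3.3.4) now appear, and we can use (3.3.2) and (3.3.3) as before 
to show that 

$$\frac{\partial g_{\mu\nu}(X)}{\partial X^{\lambda}} = \Gamma^{\rho}_{\lambda\mu}(X)\, g_{\rho\nu}(X) + \Gamma^{\rho}_{\lambda\nu}(X)\, g_{\rho\mu}(X)$$

which is precisely Eq. (3.3.1). 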

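Now we return to our previous compact notation, and solve for the affine 
connection. Add to Eq. (3.3.1) the same equation with $\mu$ and $\lambda$ interchanged and 
subtract the same equation with $\nu$ and $\lambda$ interchanged. We have then 

$$\frac{\partial g_{\mu\nu}}{\partial x^{\lambda}} + \frac{\partial g_{\lambda\nu}}{\partial x^{\mu}} - \frac{\partial g_{\mu\lambda}}{\partial x^{\nu}} = g_{\kappa\nu}\Gamma^{\kappa}_{\lambda\mu} + g_{\kappa\mu}\Gamma^{\kappa}_{\lambda\nu} + g_{\kappa\nu}\Gamma^{\kappa}_{\mu\lambda} + g_{\kappa\lambda}\Gamma^{\kappa}_{\mu\nu} - g_{\kappa\lambda}\Gamma^{\kappa}_{\nu\mu} - g_{\kappa\mu}\Gamma^{\kappa}_{\nu\lambda}$$

$$= 2\, g_{\kappa\nu}\, \Gamma^{\kappa}_{\lambda\mu} \qquad (3.3.5)$$

(Recall that $\Gamma^{\kappa}_{\mu\nu}$ and $g_{\mu\nu}$ are symmetric under interchange of $\mu$ and $\nu$.) Define a 
matrix $g^{\nu\sigma}$ as the inverse of $g_{\nu\sigma}$, that is, 

$$g^{\nu\sigma}\, g_{\kappa\nu} = \delta^{\sigma}_{\kappa} \qquad (3.3.6)$$

and multiply the above with $g^{\nu\sigma}$; this gives finally 

$$\Gamma^{\sigma}_{\lambda\mu} = \tfrac{1}{2}\, g^{\nu\sigma} \left\{ \frac{\partial g_{\mu\nu}}{\partial x^{\lambda}} + \frac{\partial g_{\lambda\nu}}{\partial x^{\mu}} - \frac{\partial g_{\mu\lambda}}{\partial x^{\nu}} \right\} \qquad (3.3.7)$$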


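[It should be noted that (3.2.7) ensures that the metric tensor does have an inverse, 
given by 

$$g^{\nu\sigma} = \eta^{\alpha\beta} \frac{\partial x^{\nu}}{\partial \xi^{\alpha}} \frac{\partial x^{\sigma}}{\partial \xi^{\beta}} \qquad (3.3.8)$$

for, using the familiar product rule 

$$\frac{\partial x^{\nu}}{\partial \xi^{\alpha}} \frac{\partial \xi^{\delta}}{\partial x^{\nu}} = \delta^{\delta}_{\alpha}$$

we find 

$$g^{\nu\sigma}\, g_{\kappa\nu} = \eta^{\alpha\beta} \frac{\partial x^{\nu}}{\partial \xi^{\alpha}} \frac{\partial x^{\sigma}}{\partial \xi^{\beta}}\, \eta_{\gamma\delta} \frac{\partial \xi^{\gamma}}{\partial x^{\kappa}} \frac{\partial \xi^{\delta}}{\partial x^{\nu}} = \eta^{\alpha\beta}\, \eta_{\gamma\alpha} \frac{\partial x^{\sigma}}{\partial \xi^{\beta}} \frac{\partial \xi^{\gamma}}{\partial x^{\kappa}} = \frac{\partial x^{\sigma}}{\partial \xi^{\beta}} \frac{\partial \xi^{\beta}}{\partial x^{\kappa}} = \delta^{\sigma}_{\kappa}$$

as required by (3.3.6).] Occasionally the right-hand side of Eq. (3.3.7) is called a 
Christoffel symbol and denoted 

$$\left\{ \begin{matrix} \sigma \\ \lambda\mu \end{matrix} \right\}$$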
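Formula (3.3.7) lends itself to direct numerical evaluation. The sketch below is a minimal illustration under stated assumptions: it uses the two-dimensional metric of the flat plane in polar coordinates, $g = \mathrm{diag}(1, r^{2})$, as a test case (a Euclidean stand-in chosen because its connection is known in closed form), approximates the derivatives of $g_{\mu\nu}$ by central differences, and inverts the metric with a helper that handles only 2x2 matrices. All function names are illustrative.

```python
def inverse_2x2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]


def polar_metric(x):
    """Metric of the flat plane in polar coordinates x = (r, theta)."""
    r, _theta = x
    return [[1.0, 0.0], [0.0, r * r]]


def christoffel(metric, x, h=1e-5):
    """gamma[s][l][m] = Gamma^s_{lm} from Eq. (3.3.7), derivatives by central differences."""
    n = len(x)
    dg = []  # dg[l][m][v] = d g_{mv} / d x^l
    for l in range(n):
        xp, xm = list(x), list(x)
        xp[l] += h
        xm[l] -= h
        gp, gm = metric(xp), metric(xm)
        dg.append([[(gp[m][v] - gm[m][v]) / (2 * h) for v in range(n)]
                   for m in range(n)])
    ginv = inverse_2x2(metric(x))
    gamma = [[[0.0] * n for _ in range(n)] for _ in range(n)]
    for s in range(n):
        for l in range(n):
            for m in range(n):
                gamma[s][l][m] = 0.5 * sum(
                    ginv[v][s] * (dg[l][m][v] + dg[m][l][v] - dg[v][m][l])
                    for v in range(n))
    return gamma


gamma = christoffel(polar_metric, [2.0, 0.3])
print(gamma[0][1][1], gamma[1][0][1])   # Gamma^r_{theta theta}, Gamma^theta_{r theta}
```

For this metric the only nonvanishing components are $\Gamma^{r}_{\theta\theta} = -r$ and $\Gamma^{\theta}_{r\theta} = \Gamma^{\theta}_{\theta r} = 1/r$, which the numerical result reproduces at $r = 2$.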


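One important consequence of the relation between the affine connection and 
the metric tensor is that the equation of motion of a freely falling particle 
automatically maintains the form of the proper time interval $d\tau$. Using (3.2.3) we 
may calculate that 

$$\frac{d}{d\tau} \left[ g_{\mu\nu} \frac{dx^{\mu}}{d\tau} \frac{dx^{\nu}}{d\tau} \right] = \frac{\partial g_{\mu\nu}}{\partial x^{\lambda}} \frac{dx^{\lambda}}{d\tau} \frac{dx^{\mu}}{d\tau} \frac{dx^{\nu}}{d\tau} + g_{\mu\nu} \frac{d^{2}x^{\mu}}{d\tau^{2}} \frac{dx^{\nu}}{d\tau} + g_{\mu\nu} \frac{dx^{\mu}}{d\tau} \frac{d^{2}x^{\nu}}{d\tau^{2}}$$

$$= \left[ \frac{\partial g_{\kappa\sigma}}{\partial x^{\lambda}} - g_{\mu\sigma}\, \Gamma^{\mu}_{\kappa\lambda} - g_{\kappa\nu}\, \Gamma^{\nu}_{\sigma\lambda} \right] \frac{dx^{\kappa}}{d\tau} \frac{dx^{\sigma}}{d\tau} \frac{dx^{\lambda}}{d\tau}$$

and (3.3.5) tells us that this vanishes, that is, 

$$g_{\mu\nu} \frac{dx^{\mu}}{d\tau} \frac{dx^{\nu}}{d\tau} = -C \qquad (3.3.9)$$

where $C$ is a constant of the motion. Hence, once we choose initial conditions such 
that $d\tau^{2}$ is given by (3.2.6), we have $C = 1$, and (3.3.9) will ensure that (3.2.6) 
continues to hold along the particle’s path. Similarly, for a massless particle the 
initial conditions give $C = 0$ (with $\tau$ replaced by some other parameter $\sigma$) and the 
equations of motion will keep $g_{\mu\nu}\, dx^{\mu}\, dx^{\nu}$ zero along the path. 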
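The conservation law (3.3.9) can be checked numerically by integrating an equation of the form (3.2.3) and monitoring $g_{\mu\nu}\,(dx^{\mu}/dp)(dx^{\nu}/dp)$. The following sketch is only an analogy under stated assumptions: it uses the flat plane in polar coordinates, where $\Gamma^{r}_{\theta\theta} = -r$ and $\Gamma^{\theta}_{r\theta} = 1/r$, rather than a physical space-time metric, and a standard fourth-order Runge-Kutta step; the initial data and step size are arbitrary choices.

```python
def geodesic_rhs(state):
    """Right-hand side of d2x/dp2 = -Gamma * (dx/dp)(dx/dp) in polar coordinates."""
    r, _th, vr, vth = state
    # Gamma^r_{theta theta} = -r,  Gamma^theta_{r theta} = Gamma^theta_{theta r} = 1/r
    return [vr, vth, r * vth * vth, -2.0 * vr * vth / r]


def rk4_step(state, dp):
    """One classical fourth-order Runge-Kutta step of size dp."""
    def shifted(base, slope, scale):
        return [b + scale * s for b, s in zip(base, slope)]
    k1 = geodesic_rhs(state)
    k2 = geodesic_rhs(shifted(state, k1, dp / 2))
    k3 = geodesic_rhs(shifted(state, k2, dp / 2))
    k4 = geodesic_rhs(shifted(state, k3, dp))
    return [s + dp / 6 * (a + 2 * b + 2 * c + d)
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)]


def invariant(state):
    """g_{mu nu} (dx^mu/dp)(dx^nu/dp) for the metric diag(1, r^2)."""
    r, _th, vr, vth = state
    return vr * vr + r * r * vth * vth


state = [1.0, 0.0, 0.3, 0.8]          # r, theta, dr/dp, dtheta/dp (arbitrary start)
c_start = invariant(state)
for _ in range(1000):
    state = rk4_step(state, 1e-3)
c_end = invariant(state)
print(c_start, c_end)
```

The quadratic form stays constant along the integrated path to within the integration error, which is the content of (3.3.9) in this simplified setting.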

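An additional consequence of the relation (3.3.5) is that we are enabled to 
formulate the law of motion of freely falling bodies as a variational principle. 
Let us introduce an arbitrary parameter $p$ to describe the path, and write the 
proper time elapsed when the particle falls from point $A$ to $B$ as 

$$T_{BA} = \int_{A}^{B} \frac{d\tau}{dp}\, dp = \int_{A}^{B} \left\{ -g_{\mu\nu} \frac{dx^{\mu}}{dp} \frac{dx^{\nu}}{dp} \right\}^{1/2} dp$$

Now vary the path from $x^{\mu}(p)$ to $x^{\mu}(p) + \delta x^{\mu}(p)$, keeping fixed the endpoints, 
that is, setting $\delta x^{\mu} = 0$ at $p_{A}$ and $p_{B}$. The change in $T_{BA}$ is 

$$\delta T_{BA} = \frac{1}{2} \int_{A}^{B} \left\{ -g_{\mu\nu} \frac{dx^{\mu}}{dp} \frac{dx^{\nu}}{dp} \right\}^{-1/2} \left\{ -\frac{\partial g_{\mu\nu}}{\partial x^{\lambda}}\, \delta x^{\lambda}\, \frac{dx^{\mu}}{dp} \frac{dx^{\nu}}{dp} - 2 g_{\mu\nu} \frac{d\,\delta x^{\mu}}{dp} \frac{dx^{\nu}}{dp} \right\} dp$$

The first factor within the integrand is just $dp/d\tau$, so the integral can be rewritten as 

$$\delta T_{BA} = -\int_{A}^{B} \left\{ \frac{1}{2} \frac{\partial g_{\mu\nu}}{\partial x^{\lambda}}\, \delta x^{\lambda}\, \frac{dx^{\mu}}{d\tau} \frac{dx^{\nu}}{d\tau} + g_{\mu\nu} \frac{d\,\delta x^{\mu}}{d\tau} \frac{dx^{\nu}}{d\tau} \right\} d\tau$$

We now integrate by parts, neglecting the endpoint contributions because $\delta x^{\mu}$ 
vanishes at $A$ and $B$. This gives 

$$\delta T_{BA} = -\int_{A}^{B} \left\{ \frac{1}{2} \frac{\partial g_{\mu\nu}}{\partial x^{\lambda}} \frac{dx^{\mu}}{d\tau} \frac{dx^{\nu}}{d\tau} - \frac{\partial g_{\lambda\nu}}{\partial x^{\sigma}} \frac{dx^{\sigma}}{d\tau} \frac{dx^{\nu}}{d\tau} - g_{\lambda\nu} \frac{d^{2}x^{\nu}}{d\tau^{2}} \right\} \delta x^{\lambda}\, d\tau$$

Inserting Eq. (3.3.5) and recalling that $\Gamma^{\lambda}_{\mu\nu}$ is symmetric in its lower indices, we find 

$$\delta T_{BA} = \int_{A}^{B} \left[ \frac{d^{2}x^{\nu}}{d\tau^{2}} + \Gamma^{\nu}_{\mu\sigma} \frac{dx^{\mu}}{d\tau} \frac{dx^{\sigma}}{d\tau} \right] g_{\lambda\nu}\, \delta x^{\lambda}\, d\tau \qquad (3.3.10)$$

Hence the space-time path taken by a particle that obeys the equations (3.2.3) 
for free fall will be such that the proper time elapsed is an extremum (and usually a 
minimum), that is, 

$$\delta T_{BA} = 0$$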


We may therefore express the equations of motion (3.2.3) geometrically, by saying 
that a particle in free fall through the curved space-time called a gravitational field 
will move on the shortest (or longest) possible path between two points, “length” 
being measured by the proper time. Such paths are called geodesics. For instance, 
we can think of the sun as distorting space-time just as a heavy weight distorts a 
rubber sheet, and can consider a comet’s path as being bent toward the sun to 
keep the path as “short” as possible. However, this geometrical analogy is an 
a posteriori consequence of the equations of motion derived from the equivalence 
principle, and plays no necessary role in our considerations. 


4 The Newtonian Limit 

To make contact with Newton’s theory, let us consider the case of a particle 
moving slowly in a weak stationary gravitational field. If the particle is sufficiently 
slow, we may neglect $d\mathbf{x}/d\tau$ with respect to $dt/d\tau$, and write (3.2.3) as 

$$\frac{d^{2}x^{\mu}}{d\tau^{2}} + \Gamma^{\mu}_{00} \left( \frac{dt}{d\tau} \right)^{2} = 0$$

Since the field is stationary, all time derivatives of $g_{\mu\nu}$ vanish, and therefore 

$$\Gamma^{\mu}_{00} = -\tfrac{1}{2}\, g^{\mu\nu} \frac{\partial g_{00}}{\partial x^{\nu}}$$

Finally, since the field is weak, we may adopt a nearly Cartesian coordinate system 
in which 

$$g_{\alpha\beta} = \eta_{\alpha\beta} + h_{\alpha\beta}, \qquad |h_{\alpha\beta}| \ll 1 \qquad (3.4.1)$$

so to first order in $h_{\alpha\beta}$, 

$$\Gamma^{\alpha}_{00} = -\tfrac{1}{2}\, \eta^{\alpha\beta} \frac{\partial h_{00}}{\partial x^{\beta}}$$

Using this affine connection in the equations of motion then gives 

$$\frac{d^{2}\mathbf{x}}{d\tau^{2}} = \frac{1}{2} \left( \frac{dt}{d\tau} \right)^{2} \nabla h_{00}$$

$$\frac{d^{2}t}{d\tau^{2}} = 0$$

The solution of the second equation is that $dt/d\tau$ equals a constant (as could also 
be seen by computing $d\tau$ with $h_{\alpha\beta}$ neglected), so dividing the equation for $d^{2}\mathbf{x}/d\tau^{2}$ 
by $(dt/d\tau)^{2}$, we find 

$$\frac{d^{2}\mathbf{x}}{dt^{2}} = \tfrac{1}{2}\, \nabla h_{00} \qquad (3.4.2)$$

The corresponding Newtonian result is 

$$\frac{d^{2}\mathbf{x}}{dt^{2}} = -\nabla \phi \qquad (3.4.3)$$

where $\phi$ is the gravitational potential, which at a distance $r$ from the center of a 
spherical body of mass $M$ takes the form 

$$\phi = -\frac{GM}{r} \qquad (3.4.4)$$

Comparing (3.4.2) with (3.4.3), we conclude that 

$$h_{00} = -2\phi + \text{constant}$$

Furthermore, the coordinate system must become Minkowskian at great distances, 
so $h_{00}$ vanishes at infinity, and if we define $\phi$ to vanish at infinity [as in (3.4.4)], 
we find that the constant here is zero, so $h_{00} = -2\phi$, and returning to the metric 
(3.4.1), 

$$g_{00} = -(1 + 2\phi) \qquad (3.4.5)$$

The gravitational potential $\phi$ is of the order of $10^{-39}$ at the surface of a proton, 
$10^{-9}$ at the surface of the earth, $10^{-6}$ at the surface of the sun, and $10^{-4}$ at the 
surface of a white dwarf star, so evidently the distortion in $g_{\mu\nu}$ produced by 
gravitation is generally very slight. (In c.g.s. units $\phi$ has the dimensions of a 
squared velocity; in our units $\phi$ is the c.g.s. value divided by the square of the 
c.g.s. speed of light.) 
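To make the unit conversion concrete, the short sketch below evaluates the dimensionless surface potential of the earth from (3.4.4). It assumes the values of $G/c^{2}$ and of the earth’s mass and radius quoted in Section 5 of this chapter, and a c.g.s. speed of light of $2.998 \times 10^{10}$ cm/sec; the function names are illustrative.

```python
G_OVER_C2 = 7.41e-29      # cm/g, the value of G in units with c = 1 (Section 5)
C_CGS = 2.998e10          # cm/sec, speed of light (assumed standard value)

M_EARTH = 5.983e27        # g   (Section 5)
R_EARTH = 6.371e8         # cm  (Section 5)


def surface_potential(mass_g, radius_cm):
    """Dimensionless potential phi = -GM/R at the surface, Eq. (3.4.4) with c = 1."""
    return -G_OVER_C2 * mass_g / radius_cm


def to_dimensionless(phi_cgs):
    """Convert a c.g.s. potential (cm^2/sec^2) to the units used here."""
    return phi_cgs / C_CGS ** 2


phi_earth = surface_potential(M_EARTH, R_EARTH)
g00_earth = -(1.0 + 2.0 * phi_earth)      # Eq. (3.4.5)
print(f"phi_earth = {phi_earth:.3e}, g00 = {g00_earth:.12f}")
```

The result is about $-7 \times 10^{-10}$, consistent with the order of magnitude $10^{-9}$ quoted above, so $g_{00}$ differs from $-1$ only in the ninth decimal place.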


5 Time Dilation 


Consider a clock in an arbitrary gravitational field, moving with arbitrary 
velocity, not necessarily in free fall. The equivalence principle tells us that its rate 
is unaffected by the gravitational field if we observe the clock from a locally 
inertial coordinate system $\xi^{\alpha}$, so according to Section 2.2, the space-time interval 
$d\xi^{\alpha}$ between ticks is governed in this system by 

$$\Delta t = \left( -\eta_{\alpha\beta}\, d\xi^{\alpha}\, d\xi^{\beta} \right)^{1/2}$$

where $\Delta t$ is the period between ticks when the clock is at rest in the absence of 
gravitation. Hence in any arbitrary coordinate system the space-time interval $dx^{\mu}$ 
between ticks will be governed by 

$$\Delta t = \left( -\eta_{\alpha\beta} \frac{\partial \xi^{\alpha}}{\partial x^{\mu}}\, dx^{\mu}\, \frac{\partial \xi^{\beta}}{\partial x^{\nu}}\, dx^{\nu} \right)^{1/2}$$

or, introducing the metric tensor (3.2.7), 

$$\Delta t = \left( -g_{\mu\nu}\, dx^{\mu}\, dx^{\nu} \right)^{1/2}$$

If the clock has velocity $dx^{\mu}/dt$, then the time interval $dt$ between ticks will be 
given by 

$$\frac{dt}{\Delta t} = \left( -g_{\mu\nu} \frac{dx^{\mu}}{dt} \frac{dx^{\nu}}{dt} \right)^{-1/2} \qquad (3.5.1)$$

In particular, if the clock is at rest this becomes 

$$\frac{dt}{\Delta t} = \left( -g_{00} \right)^{-1/2} \qquad (3.5.2)$$


We cannot observe the time dilation factors appearing in (3.5.1) and (3.5.2) 
by merely measuring the time interval $dt$ between ticks and comparing with the 
value specified by the manufacturer, because the gravitational field affects our 
time standards in exactly the same way as it affects the clock being studied. That 
is, if our standard clock says that a certain physical process takes 1 sec at rest in the 
absence of gravitation, then it will also tell us that it takes 1 sec in the presence of 
gravitation, both standard clock and process being affected by the field in the same 
way. However, we can compare the time dilation factors at two different points in a 
field. For instance, suppose that at point 1 we observe the light coming from a 
particular atomic transition at point 2. If points 1 and 2 are at rest in a stationary 
gravitational field, then the time taken for a wave crest to travel from 2 to 1 will 
be a constant, given by the integral of (3.2.10) over the path, and therefore the time 
between the arrival at point 1 of successive crests will equal the time $dt_{2}$ between 
their departure at point 2, given by (3.5.2) as 

$$dt_{2} = \Delta t \left( -g_{00}(x_{2}) \right)^{-1/2}$$

If the same atomic transition occurs at point 1, then the time between crests of the 
light waves to be observed at point 1 will be 

$$dt_{1} = \Delta t \left( -g_{00}(x_{1}) \right)^{-1/2}$$

Hence, for a given atomic transition, the ratio of the frequency (observed at point 
1) of the light from point 2 to that of the light from point 1 will be 

$$\frac{\nu_{2}}{\nu_{1}} = \left( \frac{g_{00}(x_{2})}{g_{00}(x_{1})} \right)^{1/2} \qquad (3.5.3)$$

In the weak field limit $g_{00} \simeq -1 - 2\phi$ and $\phi \ll 1$, so $\nu_{2}/\nu_{1} = 1 + \Delta\nu/\nu$, where 

$$\frac{\Delta\nu}{\nu} = \phi(x_{2}) - \phi(x_{1}) \qquad (3.5.4)$$

(For a uniform gravitational field, this result could be derived directly from the 
Principle of Equivalence, without introducing a metric or affine connection.) 

Let us apply Eq. (3.5.4) to the case of light from the sun’s surface observed on 
the earth. The sun’s gravitational potential can be calculated as 

$$\phi_{\odot} = -\frac{G M_{\odot}}{R_{\odot}}$$

where $M_{\odot}$ and $R_{\odot}$ are the sun’s mass and radius, 

$$M_{\odot} = 1.99 \times 10^{33}\ \text{g}$$
$$R_{\odot} = 0.695 \times 10^{6}\ \text{km}$$
and $G$ is the gravitational constant 

$$G = 6.67 \times 10^{-8}\ \text{erg cm/gm}^{2} = 7.41 \times 10^{-29}\ \text{cm/gm} \qquad (3.5.5)$$

(Here we have used our convention that $c = 1$ to set 1 sec $= 3 \times 10^{10}$ cm; in c.g.s. 
units the quantity $7.41 \times 10^{-29}$ cm/gm would have to be called $G/c^{2}$.) We find 
that the potential on the surface of the sun is 

$$\phi_{\odot} = -2.12 \times 10^{-6}$$

The gravitational potential of the earth is negligible in comparison, so ideally the 
frequency of light from the sun should be shifted to the red by 2.12 parts per 
million as compared with light emitted by terrestrial atoms. 

The difficulty in measuring the solar gravitational red shift can be appreciated 
if we reflect that a motion of the source by a velocity $v$ along the earth-sun direction 
will produce an additional Doppler frequency shift $\Delta\nu/\nu = v$ [recall Eq. (2.2.2)], 
so the gravitational red shift can be masked by a velocity $v = 2 \times 10^{-6}$, or in 
c.g.s. units, $v = 0.6$ km/sec. It is not the rotation of the earth or the sun that 
bothers us; these are known effects which can easily be taken into account. Thermal 
effects are more serious; at a temperature of 3000°K the thermal velocity of typical 
light elements (C, N, O) is about 2 km/sec, giving a Doppler broadening about 
three times larger than the expected gravitational red shift. However, thermal 
motions only broaden lines, not shift them, so they too can be lived with. The 
really bothersome problems arise from unknown Doppler shifts owing to the 
convection of gases in the solar atmosphere. In fact, the frequency shift is observed 
to vary from place to place on the solar disk, and is occasionally even toward the 
blue! The convection tends to be vertical, so we can minimize the Doppler shifts it 
produces by looking at the limb of the sun, where the motion is mostly at right 
angles to the line of sight. Until recently the best result that could be achieved in 
this way was that the solar gravitational red shift is of the order of 2 parts per 
million.$^{2}$ In the last few years improved observational techniques$^{3}$ have yielded a 
much better value of the red shift, equal to $1.05 \pm 0.05$ times the predicted value. 
However, it is too early to say that this result closes the story, at least until it can 
be corroborated. 
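The velocities that compete with the solar red shift can be estimated in a few lines. The sketch below converts the predicted shift into an equivalent line-of-sight velocity and compares it with thermal speeds at 3000°K. It assumes that "thermal velocity" means the root-mean-square speed $(3kT/m)^{1/2}$, and it uses standard values of Boltzmann’s constant, the atomic mass unit, and the speed of light; none of these choices is specified in the text.

```python
import math

C_KM_S = 2.998e5          # km/sec, speed of light (assumed standard value)
K_BOLTZMANN = 1.381e-16   # erg/K  (assumed standard value)
AMU = 1.661e-24           # g      (assumed standard value)

SOLAR_SHIFT = 2.12e-6     # predicted |delta nu / nu| quoted in the text


def masking_velocity_km_s(fractional_shift):
    """Line-of-sight velocity whose Doppler shift equals the given fractional shift."""
    return fractional_shift * C_KM_S


def thermal_speed_km_s(temperature_k, mass_number):
    """Root-mean-square thermal speed sqrt(3kT/m), returned in km/sec."""
    speed_cm_s = math.sqrt(3.0 * K_BOLTZMANN * temperature_k / (mass_number * AMU))
    return speed_cm_s / 1.0e5


v_mask = masking_velocity_km_s(SOLAR_SHIFT)
thermal = {name: thermal_speed_km_s(3000.0, a)
           for name, a in (("C", 12), ("N", 14), ("O", 16))}
print(f"masking velocity: {v_mask:.2f} km/sec")
for name, v in thermal.items():
    print(f"{name}: {v:.2f} km/sec, ratio to masking velocity {v / v_mask:.1f}")
```

Under these assumptions the thermal speeds come out between 2 and 2.5 km/sec, a few times the masking velocity of about 0.6 km/sec, in line with the estimate above.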

The red shifts are much larger for white dwarf stars like Sirius B and 40 
Eridani B. Such stars have masses typically of the order of one solar mass, and 
radii of the order of 1/10 to 1/100 the sun’s radius, so the red shift of spectral lines 
from their surface is about 10 to 100 times greater than for the sun, or roughly 
1 part in $10^{4}$ to $10^{5}$. Although this alleviates problems arising from convective 
Doppler shifts or temperature or pressure, a new difficulty enters here: It is 
difficult to determine the value of the gravitational potential $\phi$ with which to 
compare the measured value of $\Delta\nu/\nu$. If we know the mass of a white dwarf star, 
we can deduce a rough value for its radius and surface gravitational potential from 
astrophysical theory, 4 but the only white dwarf stars whose mass can be measured 
are members of binary star pairs. For instance the mass of Sirius B is determined 
by calculating the total mass of Sirius A and B from their separation and period, 
and then subtracting the mass of Sirius A calculated from stellar theory. However, 
the scattering of light from Sirius A by the atmosphere of Sirius B makes the 
gravitational red shift on Sirius B very difficult to measure.$^{13}$ On the other hand, 
40 Eridani B is sufficiently far from 40 Eridani A so that scattering of light is no 
problem, and the mass of 40 Eridani B can be determined separately from that 
of A by locating their center of mass in addition to measuring their period and 
separation. However, because 40 Eridani B and A are so far apart, their period of 
revolution is very long, and there has not yet been time to determine the mass of B 
very accurately. The best predicted value of the surface gravitational potential is 
$\phi = -(5.7 \pm 1) \times 10^{-5}$, in good agreement with the observed$^{5}$ red shift 
$\Delta\nu/\nu = -(7 \pm 1) \times 10^{-5}$. Taking account of Stark shifts in the spectrum of 
40 Eridani B appears to improve the agreement.$^{5a}$ 

The empirical evidence for the red shift predicted by the Principle of 
Equivalence was much improved in 1960, by a terrestrial experiment performed by 
Pound and Rebka.$^{6}$ They allowed a $\gamma$-ray emitted by a 14.4 keV, 0.1 $\mu$sec transition 
in Fe$^{57}$ to fall 22.6 m, and observed its resonant absorption by an Fe$^{57}$ target. 
(Normally resonant absorption is impossible for such a narrow $\gamma$-ray line, because 
the recoil of the emitting nucleus lowers the $\gamma$-ray energy below the nuclear energy 
difference, whereas to produce the inverse transition in the target nucleus, which 
also recoils, a little more energy than the nuclear energy difference is needed. 
This experiment was made possible by the Mössbauer effect,$^{7}$ in which the recoil 
momentum on emission and absorption is taken up by the whole crystal, so that 
essentially no energy is lost to recoil on emission or absorption.) The difference in 
the gravitational potential from top to bottom is 

$$\Delta\phi = \phi_{\text{bottom}} - \phi_{\text{top}} = -\frac{(980\ \text{cm/sec}^{2})(2260\ \text{cm})}{(3 \times 10^{10}\ \text{cm/sec})^{2}} = -2.46 \times 10^{-15}$$

and if the equivalence principle is correct we would expect the photon arriving at 
the target to be shifted upward in frequency by an amount $\Delta\nu/\nu = -\Delta\phi$, lowering 
the counting rate by a factor 

$$C = \frac{\Gamma^{2}}{\Delta\nu^{2} + \Gamma^{2}}$$

where $\Gamma$ is the full width of the $\gamma$-ray line at half-maximum. (Note that $\Gamma$ appears 
here rather than $\Gamma/2$, because we have to fold together an emission coefficient 
proportional to $[(\nu + \Delta\nu)^{2} + (\Gamma/2)^{2}]^{-1}$ with an absorption coefficient proportional 
to $[\nu^{2} + (\Gamma/2)^{2}]^{-1}$.) But in this transition the fractional width was $\Gamma/\nu = 1.13 \times 10^{-12}$, 
which is larger than the predicted $\Delta\nu/\nu$ by a factor of 460, so the reduction in 
the counting rate was by only 1 part in $2.1 \times 10^{5}$! This would seem to make the 
experiment impossible, and indeed Pound and Rebka had originally thought that 
they might have to let the $\gamma$-ray fall several kilometers in order to get a frequency 
shift $\Delta\nu$ comparable with $\Gamma$, but happily they thought of a trick which let them 
measure very small frequency shifts. Their idea was to move the $\gamma$-ray source up 
and down with velocity $v_{0} \cos \omega t$, where $\omega$ is some arbitrary fixed frequency 
(10-50 cps) and $v_{0}$ is also arbitrary, but much greater than $-\Delta\phi$, that is, much 
greater than $7.4 \times 10^{-5}$ cm/sec. To the gravitational violet shift $\Delta\nu_{G}$ there is then 
added a larger Doppler shift $\Delta\nu_{D}/\nu = -v_{0} \cos \omega t$ (see Section 2.2), so the counting 
rate is reduced by a time-dependent factor 

$$C(t) = \frac{\Gamma^{2}}{(\Delta\nu_{G} + \Delta\nu_{D})^{2} + \Gamma^{2}} = \left[ 1 + \left( \frac{\Delta\nu_{G}}{\nu} - v_{0} \cos \omega t \right)^{2} \left( \frac{\nu}{\Gamma} \right)^{2} \right]^{-1}$$

and $\Delta\nu_{G}$ can be picked out by looking for a term linear in $\cos \omega t$, for instance, by 
measuring the asymmetry between the number of counts registered when the 
source is going up (for example, $\cos \omega t > 1/\sqrt{2}$) and down ($\cos \omega t < -1/\sqrt{2}$). 
In this way Pound and Rebka obtained a value for $\Delta\nu_{G}/\nu$ about four times larger 
than the expected value $2.46 \times 10^{-15}$. This discrepancy was actually an intrinsic 
frequency shift owing to the difference between the source and target crystals 
(including differences in their temperature) and was removed by subtracting the 
asymmetry in $\gamma$-ray counts when the source is below the target from the asymmetry 
when the target is below the source. The final result for the gravitational frequency 
shift was $\Delta\nu/\nu = (2.57 \pm 0.26) \times 10^{-15}$, in excellent agreement with the predicted 
value $2.46 \times 10^{-15}$. The agreement between theory and experiment has 
since$^{8}$ been improved to about 1 percent. 
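The numbers in this account can be reproduced directly from the quantities quoted. The following sketch (variable and function names are illustrative) computes the expected fractional shift for a 22.6 m fall, the ratio of line width to shift, the resulting reduction in counting rate, and the source velocity whose Doppler shift equals the gravitational shift.

```python
G_SURFACE = 980.0        # cm/sec^2, as quoted in the text
FALL_HEIGHT = 2260.0     # cm
C_CGS = 3.0e10           # cm/sec, rounded value used in the text
WIDTH_OVER_NU = 1.13e-12 # fractional full width Gamma/nu of the line


def counting_rate_factor(shift_over_nu, width_over_nu):
    """C = Gamma^2 / (delta_nu^2 + Gamma^2), with both quantities divided by nu."""
    return width_over_nu ** 2 / (shift_over_nu ** 2 + width_over_nu ** 2)


shift = G_SURFACE * FALL_HEIGHT / C_CGS ** 2          # expected delta nu / nu
width_to_shift = WIDTH_OVER_NU / shift                # roughly 460
# Fractional reduction 1 - C, computed directly to avoid rounding error.
reduction = shift ** 2 / (shift ** 2 + WIDTH_OVER_NU ** 2)
equivalent_velocity = shift * C_CGS                   # cm/sec

print(f"delta nu / nu       = {shift:.3e}")
print(f"Gamma / delta nu    = {width_to_shift:.0f}")
print(f"rate reduced by 1 part in {1.0 / reduction:.2e}")
print(f"equivalent velocity = {equivalent_velocity:.2e} cm/sec")
```

The tiny static reduction in counting rate, about 1 part in $2 \times 10^{5}$, is what makes the modulation trick necessary: the term linear in $\cos \omega t$ is first order in $\Delta\nu_{G}$, whereas the static reduction is second order.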

There have also been proposals$^{8a}$ to measure the gravitational red shift of 
light from an artificial satellite. At a point directly below perigee there is no first-order 
Doppler shift because the time for the light to reach us from the satellite is 
momentarily constant. In this case the frequency shift of the emitted light must be 
determined from (3.5.1), whereas the frequency shift of our laboratory time 
standards may be calculated from (3.5.2), if we ignore the rotation of the earth. It 
follows that the frequency $\nu_{s}$ of a given atomic line from the satellite will be related 
to the frequency $\nu_{\oplus}$ of the same line on earth by 

$$\frac{\nu_{s}}{\nu_{\oplus}} = \frac{\left( -g_{\mu\nu} \dfrac{dx^{\mu}}{dt} \dfrac{dx^{\nu}}{dt} \right)_{s}^{1/2}}{\left( -g_{00} \right)_{\oplus}^{1/2}} \qquad (3.5.6)$$

The velocity $v_{s}$ of the satellite is given by 

$$v_{s}^{2} = -\phi_{s} = \frac{G M_{\oplus}}{R_{\oplus} + H}$$

where $H$ is the altitude of the satellite and $M_{\oplus}$ and $R_{\oplus}$ are the mass and radius of 
the earth, 

$$M_{\oplus} = 5.983 \times 10^{27}\ \text{g}$$
$$R_{\oplus} = 6.371 \times 10^{8}\ \text{cm}$$

In the weak field approximation we have 

$$\left( -g_{\mu\nu} \frac{dx^{\mu}}{dt} \frac{dx^{\nu}}{dt} \right)_{s} \simeq -(g_{00})_{s} - v_{s}^{2} = 1 + 2\phi_{s} - v_{s}^{2} = 1 - \frac{3 G M_{\oplus}}{R_{\oplus} + H}$$

and 

$$-(g_{00})_{\oplus} \simeq 1 + 2\phi_{\oplus} = 1 - \frac{2 G M_{\oplus}}{R_{\oplus}}$$

so to this order Eq. (3.5.6) gives a frequency ratio $\nu_{s}/\nu_{\oplus} = 1 + \Delta\nu/\nu$, where 

$$\frac{\Delta\nu}{\nu} = -\frac{3 G M_{\oplus}}{2 (R_{\oplus} + H)} + \frac{G M_{\oplus}}{R_{\oplus}} \simeq -3.47 \times 10^{-10} \left[ \frac{3 R_{\oplus}}{R_{\oplus} + H} - 2 \right]$$

We see that at low altitudes there is a purely special-relativistic red shift (see 
Section 2.2), to which is added at higher altitudes a general-relativistic violet shift, 
yielding a net red shift for $H < R_{\oplus}/2$ and a net violet shift for $H > R_{\oplus}/2$. 
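The dependence of the net shift on altitude is easy to tabulate. The sketch below evaluates the first form of $\Delta\nu/\nu$ given above, using $G = 7.41 \times 10^{-29}$ cm/gm from (3.5.5) together with the quoted mass and radius of the earth; the function name and the sample altitudes are illustrative choices.

```python
G_OVER_C2 = 7.41e-29     # cm/g, Eq. (3.5.5)
M_EARTH = 5.983e27       # g
R_EARTH = 6.371e8        # cm


def satellite_shift(altitude_cm):
    """Fractional shift of a satellite line relative to a laboratory line:
    -3GM / (2 (R + H)) + GM / R, valid to first order in the potential."""
    gm = G_OVER_C2 * M_EARTH
    return -1.5 * gm / (R_EARTH + altitude_cm) + gm / R_EARTH


shift_ground = satellite_shift(0.0)                # grazing orbit: net red shift
shift_crossover = satellite_shift(R_EARTH / 2.0)   # the two effects cancel
shift_high = satellite_shift(3.6e9)                # roughly 36,000 km up: net violet shift

for label, value in (("H = 0", shift_ground),
                     ("H = R/2", shift_crossover),
                     ("H = 3.6e9 cm", shift_high)):
    print(f"{label:>14}: {value:+.3e}")
```

At $H = 0$ the shift is about $-3.5 \times 10^{-10}$; it changes sign at $H = R_{\oplus}/2$, about 3200 km, and approaches $+7 \times 10^{-10}$ at very great altitudes.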

Incidentally, the gravitational red shift of light rising from a lower to a higher 
gravitational potential can to some extent be understood as a consequence of 
quantum theory, energy conservation, and the “weak” Principle of Equivalence. 
When a photon is produced at point 1 by some heavy nonrelativistic apparatus,
an observer in a locally inertial coordinate system moving with the apparatus will
see its internal energy and hence its inertial mass change by an amount related to
the photon frequency $\nu_1$ he observes, that is, by

$$\Delta m_1 = -h\nu_1$$

where $h = 6.625 \times 10^{-27}$ erg sec is Planck's constant. Suppose that the photon
is then absorbed at point 2 by a second heavy apparatus; an observer in a freely
falling system will see the apparatus change in inertial mass by an amount related
to the photon frequency $\nu_2$ he observes, that is, by

$$\Delta m_2 = h\nu_2$$
However, the total internal plus gravitational potential energy of the two pieces of
apparatus must be the same before and after these events, so

$$0 = \Delta m_1 + \phi_1 \Delta m_1 + \Delta m_2 + \phi_2 \Delta m_2$$

and therefore

$$\frac{\nu_2}{\nu_1} = \frac{1 + \phi_1}{1 + \phi_2} \simeq 1 + \phi_1 - \phi_2$$

in agreement with our previous result. (Also, it makes no difference whether the
photon frequencies are measured in locally inertial systems, because the gravitational
field in any other frame will affect the rate of the observer’s standard clock
in the same way as it affects the $\nu$’s.) This result can be interpreted as saying that a
photon in a gravitational field has “kinetic energy” $h\nu$ and “potential energy”
$h\nu\phi$, their sum remaining constant. However, I have insisted on including a
nonrelativistic emitter and absorber in the above calculation, because the concept of
gravitational potential energy for a photon is otherwise without foundation.

This derivation rests on the Principle of Equivalence in two respects: It
assumes that the change in gravitational mass of the apparatus equals the change
in its inertial mass and hence its internal energy; and it also assumes that in a
freely falling frame the relation between photon energy and frequency is unaffected
by the presence of gravitational fields. Hence even if we suppose that the
Eötvös-Dicke experiments could improve to an unlimited accuracy, and that gravitational
mass were found to equal inertial mass exactly, still there would be some point in
verifying the gravitational red shift of spectral lines, as an independent test of the
Principle of Equivalence.
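
As a rough numerical illustration (a sketch only, not part of the original argument), the ratio $\nu_2/\nu_1 = (1 + \phi_1)/(1 + \phi_2)$ can be evaluated for a photon rising through a small height near the earth's surface, where $\phi_2 - \phi_1 \simeq g\ell/c^2$. The tower height of 22.6 m and the values of $g$ and $c$ below are assumptions chosen for illustration; with them the result is close to the $2.46 \times 10^{-15}$ quoted earlier in this section. The function name is hypothetical.

```python
C = 2.998e10        # speed of light, cm/s (assumed value)
G_SURFACE = 980.7   # surface gravity, cm/s^2 (assumed value)

def fractional_shift(phi_1, phi_2):
    """nu_2/nu_1 - 1 for emission at potential phi_1 and absorption at phi_2.

    Potentials are in units of c^2. The expression (phi_1 - phi_2)/(1 + phi_2)
    is algebraically identical to (1 + phi_1)/(1 + phi_2) - 1, but it avoids
    subtracting two nearly equal floating-point numbers.
    """
    return (phi_1 - phi_2) / (1.0 + phi_2)

height_cm = 2260.0                       # assumed tower height
phi_bottom = 0.0                         # only the difference matters
phi_top = G_SURFACE * height_cm / C**2   # higher potential at the top
shift = fractional_shift(phi_bottom, phi_top)
print(f"fractional shift for a rising photon: {shift:.3e}")
```

The sign is negative, that is, a red shift for light rising from a lower to a higher gravitational potential.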


6 Signs of the Times 


The relation between the Minkowski metric $\eta_{\alpha\beta}$ and the metric tensor $g_{\mu\nu}$ of
the theory of gravitation may be expressed in a matrix notation

$$g = D^T \eta D \tag{3.6.1}$$

where $g$ is for the purposes of this section a $4 \times 4$ matrix (not a determinant)
whose elements are the $g_{\mu\nu}$, $\eta$ is a matrix whose elements are the $\eta_{\alpha\beta}$, and $D$ is the
matrix

$$D_{\alpha\mu} \equiv \frac{\partial \xi^\alpha}{\partial x^\mu} \tag{3.6.2}$$

with $D^T$ its transpose.

It has been tacitly assumed as part of the Principle of Equivalence that the
transformation from laboratory coordinates $x^\mu$ to locally inertial coordinates $\xi^\alpha$ is
nonsingular: that is, $\xi^\alpha$ is a differentiable function of $x^\mu$ and $x^\mu$ is a differentiable
function of $\xi^\alpha$. It follows that there exists a matrix

$$D^{-1}_{\mu\alpha} \equiv \frac{\partial x^\mu}{\partial \xi^\alpha}$$

which is reciprocal to $D$, that is,

$$\frac{\partial x^\mu}{\partial \xi^\alpha} \frac{\partial \xi^\alpha}{\partial x^\nu} = \delta^\mu_\nu \tag{3.6.3}$$

so that $D$ must have nonvanishing determinant

$$\text{Det } D \neq 0 \tag{3.6.4}$$

A transformation of the form (3.6.1) with D having a nonvanishing determinant is 
called a congruence. 

The fact that $g_{\mu\nu}$ is related to $\eta_{\alpha\beta}$ by the congruence (3.6.1) does not mean that
the eigenvalues of $g_{\mu\nu}$ are the same as those of $\eta_{\alpha\beta}$, as would be the case if this were
a similarity transformation. (Indeed, there are no invariant functions of the
components of the metric tensor, although there are invariant functions of the $g_{\mu\nu}$
and their derivatives, as shown in Chapter 6.) However, there is a theorem known
as Sylvester's law of inertia$^{9}$ that states that the numbers of eigenvalues that are
respectively positive, negative, or zero do not change under such a congruence.
We conclude then that the metric tensor $g_{\mu\nu}$ must, like $\eta_{\alpha\beta}$, have three positive
eigenvalues, one negative eigenvalue, and no zero eigenvalues. It is this property
of the metric that distinguishes our familiar (3 + 1)-dimensional space-time from
4-dimensional space, or (2 + 2)-dimensional space-time, or worse.
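
Sylvester's law can be illustrated with exact arithmetic. The sketch below is an illustration under stated assumptions, not part of the original text: the matrix `D` is an arbitrary nonsingular integer matrix standing in for $\partial\xi^\alpha/\partial x^\mu$ at one point, and the helper names are hypothetical. It computes the characteristic polynomial by the Faddeev-LeVerrier recursion and counts eigenvalue signs with Descartes' rule of signs, which is exact for real symmetric matrices because all their eigenvalues are real.

```python
from fractions import Fraction

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(col) for col in zip(*a)]

def char_poly(a):
    """Coefficients c[0..n] of det(lam*I - a) = sum_k c[k] lam^k (Faddeev-LeVerrier)."""
    n = len(a)
    a = [[Fraction(x) for x in row] for row in a]
    c = [Fraction(0)] * (n + 1)
    c[n] = Fraction(1)
    m = [[Fraction(0)] * n for _ in range(n)]
    for k in range(1, n + 1):
        am = matmul(a, m)
        m = [[am[i][j] + (c[n - k + 1] if i == j else 0) for j in range(n)]
             for i in range(n)]
        am = matmul(a, m)
        c[n - k] = -sum(am[i][i] for i in range(n)) / k
    return c

def sign_changes(coeffs):
    signs = [1 if x > 0 else -1 for x in coeffs if x != 0]
    return sum(1 for s, t in zip(signs, signs[1:]) if s != t)

def signature(a):
    """(number of positive, negative, zero eigenvalues) of a real symmetric matrix."""
    c = char_poly(a)
    zero = next(k for k, x in enumerate(c) if x != 0)
    positive = sign_changes(c)
    negative = sign_changes([x if k % 2 == 0 else -x for k, x in enumerate(c)])
    return positive, negative, zero

eta = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, -1]]
D = [[1, 2, 0, 1], [0, 1, 3, 0], [0, 0, 2, 1], [0, 0, 0, 1]]  # triangular, Det D = 2
g = matmul(transpose(D), matmul(eta, D))                       # congruence (3.6.1)

print("signature of eta:", signature(eta))
print("signature of g:  ", signature(g))
print("same characteristic polynomial?", char_poly(g) == char_poly(eta))
```

The two signatures agree even though the characteristic polynomials, and hence the eigenvalues themselves, differ.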


7 Relativity and Anisotropy of Inertia 

We have already seen in Section 1.3 that Newton and Mach came to different 
conclusions about the origin of inertia. Newton believed that inertial forces, such as 
centrifugal force, must arise from acceleration with respect to “absolute space,” 
whereas Mach argued that they were more likely caused by acceleration with 
respect to the mass of the celestial bodies. The distinction is not one of metaphysics 
but of physics, for if Mach were right then a large mass could produce small 
changes in the inertial forces observed in its vicinity, whereas if Newton were right 
then no such effect could occur. 

Einstein considered himself a follower of Mach, but in fact the answer given 
by the equivalence principle to the problem of inertia lies somewhere between that 
of Newton and Mach. The inertial frames, that is, the “freely falling coordinate 
systems,” are indeed determined by the local gravitational field, which arises from 
all the matter of the universe, far and near. However, once in an inertial frame, the
laws of motion [such as Eq. (2.3.1)] are completely unaffected by the presence of 
nearby masses, either gravitationally or in any other way. For instance, the mass 
of the sun determines the motion of the freely falling earth, but once we fix our 
coordinate frame to the earth we cannot detect the gravitational field of the sun, 
as shown with great accuracy by the experiment of Dicke. (See Section 1.2. 
Actually, the fact that the earth is not an infinitesimal neighborhood means that 
we can detect the sun’s field through tidal effects, as already discussed in Section 
3.1.) The celestial bodies play a role here because the gravitational field equations 
for $g_{\mu\nu}$ need boundary conditions at infinity, and these are provided by the requirement
that at great distances from the sun $g_{\mu\nu}$ merge with the cosmic gravitational
field produced by all the mass in the universe. We are not yet ready to go into the 
details of the field equations and cosmology, but we can anticipate that the 
gravitational field determined by the mass of the sun and these cosmic boundary 
conditions is such that planetary orbits far from the sun do not precess with 
respect to the typical stars, in agreement with observation. (See Section 15.1.) 

These points are so important that they are worth repeating. In the absence of 
nearby matter, the inertial frames are determined by the mean cosmic gravitational 
field, which is in turn determined by the mean mass density of the stars, so it is not 
surprising that their inertial frames are at rest, or in a state of uniform non- 
rotating motion, with respect to the typical stars. When a large mass like the sun is 
brought close, it changes the inertial frames so that they accelerate toward the 
mass, but the laws of motion in these freely falling frames are still those of special 
relativity, and show no effects of the surrounding mass distribution. In this sense, 
the equivalence principle and Mach’s principle are in direct opposition. 

The issue between Mach and Einstein can be drawn by asking whether in fact
the presence of large nearby masses does affect the laws of motion, other than by
determining the inertial frames. Cocconi and Salpeter pointed out$^{10}$ that there is
a large mass near us, the Milky Way galaxy, and that Mach’s principle would
suggest slight differences in inertial mass when a particle is accelerated toward or
away from the galactic center. This was checked experimentally by Hughes,
Robinson, and Beltran-Lopez,$^{11}$ and in a similar experiment, by Drever.$^{12}$ (See
Figure 3.1.) Hughes et al. observed the resonant absorption of photons by a Li$^7$
nucleus in a 4700 Gauss magnetic field. The ground state has spin 3/2, so it splits
in a magnetic field into four energy levels, which should be equally spaced if the
laws of nuclear physics are rotationally invariant. In this case the three transitions
among neighboring states should have the same energy and the photon absorption
coefficient should show a single sharp peak at this energy. However, if inertia were
anisotropic then the four magnetic substates would not be exactly equally spaced,
and there would be not one but three closely spaced resonance lines. Hughes et al.
found that no such splitting greater than the line width of $5.3 \times 10^{-21}$ MeV
occurred over a 12-hr period, during which the rotation of the earth carried the
magnetic field from 22° toward the galactic center to 104° away from the galactic
center. If we think of the Li$^7$ nucleus as a single proton with angular momentum
3/2 bound by a central potential to the other nucleons, then the anisotropy $\Delta m$
in the proton mass must be such that

$$\left| \Delta \left( \frac{p^2}{2m} \right) \right| \simeq \frac{\Delta m}{m} \left( \frac{p^2}{2m} \right) < 5.3 \times 10^{-21} \text{ MeV}$$

where $p^2/2m$ is the proton kinetic energy. Since $p^2/2m$ is greater than $\frac{1}{2}$ MeV, we
can conclude that the anisotropy in inertial mass is subject to the inequality

$$\frac{\Delta m}{m} < 10^{-20}$$

Figure 3.1 The Li$^7$ absorption spectrum as a test of the isotropy of inertia (frequency
differences and line splitting greatly exaggerated). Panels: isotropic inertia; anisotropic inertia.

At least in this regard, the evidence strongly favors the equivalence principle rather 
than Mach’s principle. 
“In Mathematicks he was greater
Than Tycho Brahe or Erra Pater:
For he by Geometrick scale
Could take the size of Pots of Ale;
Resolve by Signes and Tangents straight,
If Bread or Butter wanted weight;
And wisely tell what hour o’th day
The Clock does strike, by Algebra.”
Samuel Butler, Sir Hudibras, His Passing Worth


4 TENSOR 
ANALYSIS 


We have already noticed that the Principle of Equivalence of Gravitation and 
Inertia establishes a deep analogy between non-Euclidean geometry and the theory 
of gravitation. This chapter is devoted to an outline of the language common to 
both, that of tensor analysis. 


1 The Principle of General Covariance 

In the last chapter we demonstrated one way of using the Principle of 
Equivalence to assess the effects of gravitation on physical systems: We wrote 
down the equations that hold, for general gravitational fields, in locally inertial 
coordinate systems (i.e., the equations of special relativity, such as $d^2\xi^\alpha/d\tau^2 = 0$)
and then performed a coordinate transformation to find the corresponding equa- 
tions in the laboratory coordinate system. We could continue to follow this 
approach, but it would lead us into very tedious calculations when we come to the 
field equations for electromagnetism and gravitation. Instead, we shall follow a 
different method, one that is of precisely the same physical content, but is much 
more elegant in appearance and convenient in execution. This method is based on 
an alternative version of the Principle of Equivalence, known as the Principle of 
General Covariance. It states that a physical equation holds in a general gravitational
field, if two conditions are met:

1. The equation holds in the absence of gravitation; that is, it agrees with the
laws of special relativity when the metric tensor $g_{\alpha\beta}$ equals the Minkowski tensor
$\eta_{\alpha\beta}$ and when the affine connection $\Gamma^\alpha_{\beta\gamma}$ vanishes.

2. The equation is generally covariant; that is, it preserves its form under a
general coordinate transformation $x \to x'$.

To see that the Principle of General Covariance follows from the Principle of 
Equivalence, let us suppose that we are in an arbitrary gravitational field, and 
consider any equation that satisfies the two above conditions. From (2), we learn 
that the equation will be true in all coordinate systems if it is true in any one 
coordinate system. But at any given point there is a class of coordinate systems, 
the locally inertial systems, in which the effects of gravitation are absent. Condition 
(1) then tells us that our equation holds in these systems, and hence in all other 
coordinate systems. 

It should be stressed that general covariance by itself is empty of physical 
content.$^{1}$ Any equation can be made generally covariant by writing it in any one
coordinate system, and then working out what it looks like in other arbitrary 
coordinate systems. Indeed, from childhood we have become familiar with the 
appearance of physical equations in non-Cartesian systems, such as polar co- 
ordinates, and in noninertial systems, such as rotating coordinates. The significance 
of the Principle of General Covariance lies in its statement about the effects of 
gravitation, that a physical equation by virtue of its general covariance will be 
true in a gravitational field if it is true in the absence of gravitation. 

The meaning of general covariance can be brought forward by comparing it 
with Lorentz invariance. Just as any equation can be made generally covariant, 
so any equation can be made Lorentz-invariant, by writing it in one coordinate 
system and then working out what it looks like after a Lorentz transformation. 
However, if we do this with a nonrelativistic equation like Newton’s second law, 
we find after making it Lorentz-invariant that a new quantity has entered the 
equation, which of course is the velocity of the coordinate frame with respect to 
the original reference frame. The requirement that this velocity not appear in the 
transformed equation is what we call the Principle of Special Relativity, or 
“Lorentz invariance” for short, and this requirement places very powerful restrictions
on the original equation. Similarly, when we make an equation generally
covariant, new ingredients will enter, that is, the metric tensor $g_{\mu\nu}$ and the affine
connection $\Gamma^\lambda_{\mu\nu}$. The difference is that we do not require that these quantities drop
out at the end, and hence we do not obtain any restrictions on the equation we start
with; rather, we exploit the presence of $g_{\mu\nu}$ and $\Gamma^\lambda_{\mu\nu}$ to represent gravitational fields.
To put this briefly: The Principle of General Covariance is not an invariance
principle, like the Principle of Galilean or Special Relativity, but is instead a
statement about the effects of gravitation, and about nothing else. In particular,
general covariance does not imply Lorentz invariance: there are generally
covariant theories of gravitation that allow the construction of inertial frames at
any point in a gravitational field, but that satisfy Galilean relativity rather than
special relativity in these frames.$^{1a}$

Any physical principle, such as general covariance, which takes the form of an 
invariance principle but whose content is actually limited to a restriction on the 
interactions of one particular field, is called a dynamic symmetry.$^{2}$ There are other
dynamic symmetries of importance in physics, such as local gauge invariance,
which governs the interactions of the electromagnetic field, and chiral symmetry,$^{3}$
which governs the interactions of the pi-meson field. We shall return to the analogy 
between general relativity and electrodynamics several times in following chapters. 

The Principle of General Covariance can only be applied on a scale that is small 
compared with the space-time distances typical of the gravitational field, for it is 
only on this small scale that we are assured by the Principle of Equivalence of 
being able to construct a coordinate system in which the effects of gravitation are 
absent. For instance, the radius of the moon is not so very much smaller than the 
earth-moon separation, so we cannot accurately calculate the motion of the moon 
by finding generally covariant equations that reduce to the correct equations for a 
freely moving moon in the absence of gravitation. We can, however, treat the moon 
as a ball of rock and calculate its motion by applying the Principle of General 
Covariance to determine the gravitational force on each infinitesimal element of 
the lunar mass. 

There are in general many generally covariant equations that reduce to a
given special-relativistic equation in the absence of gravitation. However, because
we only apply the Principle of General Covariance on a small scale compared with
the scale of the gravitational field, we usually expect that it is only $g_{\mu\nu}$ and its first
derivatives that enter our generally covariant equations. With this understanding
we shall see in this and the next chapter that the Principle of General Covariance 
makes an unambiguous statement about the effects of gravitational fields on any 
system, or part of a system, that is sufficiently small. 


2 Vectors and Tensors 

In order to construct physical equations that are invariant under general 
coordinate transformation, we must know how the quantities described by the
equations behave under these transformations. For some quantities, those defined 
directly in terms of coordinate differentials, the transformation properties may be 
determined by a straightforward calculation. For other quantities, such as the 
electromagnetic fields, the transformation properties are partially a matter of 
definition. However, there is a tendency for all quantities of physical interest to 
transform in a reasonably simple way, for otherwise it would be difficult to put 
them together to form invariant equations. In this section we describe one class of 
objects whose transformation properties are particularly simple, giving examples 
(where we can) from quantities defined directly in terms of the coordinate system.

The simplest of all transformation rules is that of the scalars, which simply
do not change under general coordinate transformations. The obvious example is a
pure number, like 137 or $\pi$ or zero. Another example is the proper time $d\tau$, given by
Eq. (3.2.6); in fact, we shall see below that the metric tensor $g_{\mu\nu}$ is defined to
transform in just such a way as to keep $d\tau^2$ invariant.

The next simplest transformation rule is that of a contravariant vector, $V^\mu$,
which under a coordinate transformation $x^\mu \to x'^\mu$ transforms into

$$V'^\mu = V^\nu \frac{\partial x'^\mu}{\partial x^\nu} \tag{4.2.1}$$

For instance, the rules of partial differentiation give

$$dx'^\mu = \frac{\partial x'^\mu}{\partial x^\nu} dx^\nu \tag{4.2.2}$$

so the coordinate differential is a contravariant vector. A very closely related
transformation rule is that of a covariant vector $U_\mu$, which, under a coordinate
transformation $x^\mu \to x'^\mu$, transforms into

$$U'_\mu = \frac{\partial x^\nu}{\partial x'^\mu} U_\nu \tag{4.2.3}$$

For instance, if $\phi$ is a scalar field, then $\partial\phi/\partial x^\mu$ is a covariant vector, because in a
transformed coordinate system the gradient is

$$\frac{\partial \phi}{\partial x'^\mu} = \frac{\partial x^\nu}{\partial x'^\mu} \frac{\partial \phi}{\partial x^\nu}$$

in agreement with (4.2.3).
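
The covariant rule (4.2.3) for a gradient can be checked numerically in a simple case. This is only an illustrative sketch, not part of the original text: it assumes a two-dimensional example with Cartesian coordinates $(x, y)$ as the unprimed system and plane polar coordinates $(r, \theta)$ as the primed system, an arbitrarily chosen scalar field, and finite-difference derivatives. All names are hypothetical.

```python
import math

def phi(x, y):
    """An arbitrary scalar field chosen for illustration."""
    return x * x * y + math.sin(y)

def to_cartesian(r, theta):
    return r * math.cos(theta), r * math.sin(theta)

def partial(f, args, i, h=1e-6):
    """Central-difference estimate of the partial derivative of f with respect to args[i]."""
    up = list(args); up[i] += h
    dn = list(args); dn[i] -= h
    return (f(*up) - f(*dn)) / (2 * h)

r, theta = 1.7, 0.6
x, y = to_cartesian(r, theta)

# Gradient components in the unprimed (Cartesian) system.
grad_cart = [partial(phi, (x, y), i) for i in range(2)]

# Matrix dx^nu / dx'^mu with x' = (r, theta), written out analytically.
dx_dxprime = [
    [math.cos(theta), math.sin(theta)],            # d(x, y)/dr
    [-r * math.sin(theta), r * math.cos(theta)],   # d(x, y)/dtheta
]

# Rule (4.2.3): U'_mu = (dx^nu / dx'^mu) U_nu.
grad_rule = [sum(dx_dxprime[mu][nu] * grad_cart[nu] for nu in range(2))
             for mu in range(2)]

# Direct differentiation of phi expressed in the primed coordinates.
def phi_polar(r_, theta_):
    return phi(*to_cartesian(r_, theta_))

grad_direct = [partial(phi_polar, (r, theta), i) for i in range(2)]

print("from rule (4.2.3):", grad_rule)
print("direct derivative:", grad_direct)
```

The two lists agree to within the finite-difference error.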

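From the contravariant and covariant vectors we can immediately generalize
to the tensors. A tensor with upper indices $\mu, \nu, \ldots$ and lower indices $\kappa, \lambda, \ldots$
transforms like the product of contravariant vectors $U^\mu W^\nu \cdots$ and covariant
vectors $V_\kappa Z_\lambda \cdots$. For instance, under a coordinate transformation $x \to x'$ a
tensor $T^{\mu}{}_{\nu}{}^{\lambda}$ will transform into

$$T'^{\mu}{}_{\nu}{}^{\lambda} = \frac{\partial x'^\mu}{\partial x^\kappa} \frac{\partial x^\rho}{\partial x'^\nu} \frac{\partial x'^\lambda}{\partial x^\sigma} T^{\kappa}{}_{\rho}{}^{\sigma} \tag{4.2.5}$$

If all indices are upstairs the tensor is called contravariant; if all indices are downstairs
the tensor is called covariant; otherwise it is called mixed. The most important
example is the metric tensor, defined in Section 3.2 for a general coordinate
system $x^\mu$ by

$$g_{\mu\nu} \equiv \eta_{\alpha\beta} \frac{\partial \xi^\alpha}{\partial x^\mu} \frac{\partial \xi^\beta}{\partial x^\nu}$$

where $\xi^\alpha$ is a locally inertial coordinate system. In a different coordinate system
$x'^\mu$ the metric tensor is

$$g'_{\mu\nu} = \eta_{\alpha\beta} \frac{\partial \xi^\alpha}{\partial x'^\mu} \frac{\partial \xi^\beta}{\partial x'^\nu} = \eta_{\alpha\beta} \frac{\partial \xi^\alpha}{\partial x^\rho} \frac{\partial x^\rho}{\partial x'^\mu} \frac{\partial \xi^\beta}{\partial x^\sigma} \frac{\partial x^\sigma}{\partial x'^\nu}$$

and therefore

$$g'_{\mu\nu} = g_{\rho\sigma} \frac{\partial x^\rho}{\partial x'^\mu} \frac{\partial x^\sigma}{\partial x'^\nu} \tag{4.2.6}$$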


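We see that $g_{\mu\nu}$ is indeed a covariant tensor. Its inverse is a contravariant tensor,
for if we define $g^{\lambda\mu}$ so that

$$g^{\lambda\mu} g_{\mu\nu} = \delta^\lambda_\nu$$

we shall have

$$\frac{\partial x'^\lambda}{\partial x^\rho} \frac{\partial x'^\mu}{\partial x^\sigma} g^{\rho\sigma} g'_{\mu\nu} = \frac{\partial x'^\lambda}{\partial x^\rho} \frac{\partial x'^\mu}{\partial x^\sigma} g^{\rho\sigma} \frac{\partial x^\kappa}{\partial x'^\mu} \frac{\partial x^\eta}{\partial x'^\nu} g_{\kappa\eta} = \frac{\partial x'^\lambda}{\partial x^\rho} \frac{\partial x^\rho}{\partial x'^\nu} = \delta^\lambda_\nu$$

and therefore

$$g'^{\lambda\mu} = \frac{\partial x'^\lambda}{\partial x^\rho} \frac{\partial x'^\mu}{\partial x^\sigma} g^{\rho\sigma} \tag{4.2.7}$$

as required for a contravariant tensor. Finally, the Kronecker symbol $\delta^\mu_\nu$ is a
mixed tensor, because

$$\delta^\rho_\sigma \frac{\partial x'^\mu}{\partial x^\rho} \frac{\partial x^\sigma}{\partial x'^\nu} = \frac{\partial x'^\mu}{\partial x^\rho} \frac{\partial x^\rho}{\partial x'^\nu} = \delta^\mu_\nu \tag{4.2.8}$$

Aside from the scalars and zero, $\delta^\mu_\nu$ (together with its direct products) is the only
tensor whose components are the same in all coordinate systems.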

A vector is just a tensor with one index and a scalar is a tensor with no 
indices, so it will not generally be necessary to give the scalars and vectors special 
treatment in the following. However, the reader should be warned that not everything
is a tensor; in particular, the affine connection $\Gamma^\lambda_{\mu\nu}$, despite its appearance, is
not a tensor.

We can now recognize one very large class of invariant equations: Any 
equation will be invariant under general coordinate transformations if it states the 
equality of two tensors with the same upper and lower indices. For instance, if 
$A^{\mu}{}_{\nu}{}^{\lambda}$ and $B^{\mu}{}_{\nu}{}^{\lambda}$ are two tensors with the transformation rule (4.2.5), and if in the
$x^\mu$ coordinate system $A^{\mu}{}_{\nu}{}^{\lambda} = B^{\mu}{}_{\nu}{}^{\lambda}$, then obviously in the $x'^\mu$ coordinate system
$A'^{\mu}{}_{\nu}{}^{\lambda} = B'^{\mu}{}_{\nu}{}^{\lambda}$. In particular, since zero is any kind of tensor we want, a statement
that a given tensor vanishes is invariant under general coordinate transformations.
In contrast, a statement that is not an equality between tensors of the same kind
(for instance, $T^{\mu\nu} = 5$ or $V^\mu = U_\mu$) may be numerically true in a limited class of
coordinate systems, but not in all coordinate systems.


3 Tensor Algebra 


The next step in our program of constructing equations invariant under general 
coordinate transformations is to learn how to put tensors together to form other 
tensors. This is accomplished through a few simple algebraic operations: 

(A) Linear Combinations. A linear combination of tensors with the same 
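upper and lower indices is a tensor with these indices. For instance, let $A^\mu{}_\nu$ and
$B^\mu{}_\nu$ be mixed tensors, and let

$$T^\mu{}_\nu \equiv a A^\mu{}_\nu + b B^\mu{}_\nu$$

where $a$ and $b$ are scalars; then $T^\mu{}_\nu$ is a tensor, because

$$T'^\mu{}_\nu \equiv a A'^\mu{}_\nu + b B'^\mu{}_\nu = a \frac{\partial x'^\mu}{\partial x^\rho} \frac{\partial x^\sigma}{\partial x'^\nu} A^\rho{}_\sigma + b \frac{\partial x'^\mu}{\partial x^\rho} \frac{\partial x^\sigma}{\partial x'^\nu} B^\rho{}_\sigma = \frac{\partial x'^\mu}{\partial x^\rho} \frac{\partial x^\sigma}{\partial x'^\nu} T^\rho{}_\sigma$$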

(B) Direct Products. The product of the components of two tensors yields a 
tensor whose upper and lower indices consist of all the upper and lower indices of 
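the two original tensors. For instance, if $A^\mu{}_\nu$ and $B^\rho$ are tensors, and

$$T^{\mu}{}_{\nu}{}^{\rho} \equiv A^\mu{}_\nu B^\rho$$

then $T^{\mu}{}_{\nu}{}^{\rho}$ is a tensor, that is,

$$T'^{\mu}{}_{\nu}{}^{\rho} \equiv A'^\mu{}_\nu B'^\rho = \frac{\partial x'^\mu}{\partial x^\lambda} \frac{\partial x^\kappa}{\partial x'^\nu} A^\lambda{}_\kappa \frac{\partial x'^\rho}{\partial x^\sigma} B^\sigma = \frac{\partial x'^\mu}{\partial x^\lambda} \frac{\partial x^\kappa}{\partial x'^\nu} \frac{\partial x'^\rho}{\partial x^\sigma} T^{\lambda}{}_{\kappa}{}^{\sigma}$$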

(C) Contraction. Setting an upper and lower index equal and summing it 
over its four values yields a new tensor with these two indices absent. For instance, 
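if $T^{\mu}{}_{\nu}{}^{\rho\sigma}$ is a tensor and

$$T^{\mu\rho} \equiv T^{\mu}{}_{\nu}{}^{\rho\nu}$$

then $T^{\mu\rho}$ is a tensor, that is,

$$T'^{\mu\rho} \equiv T'^{\mu}{}_{\nu}{}^{\rho\nu} = \frac{\partial x'^\mu}{\partial x^\kappa} \frac{\partial x^\lambda}{\partial x'^\nu} \frac{\partial x'^\rho}{\partial x^\eta} \frac{\partial x'^\nu}{\partial x^\xi} T^{\kappa}{}_{\lambda}{}^{\eta\xi} = \frac{\partial x'^\mu}{\partial x^\kappa} \frac{\partial x'^\rho}{\partial x^\eta} T^{\kappa}{}_{\lambda}{}^{\eta\lambda} = \frac{\partial x'^\mu}{\partial x^\kappa} \frac{\partial x'^\rho}{\partial x^\eta} T^{\kappa\eta}$$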

These three operations can of course be combined in various ways. One
particularly important combined operation results in the raising and lowering of
indices. If we take the direct product of a contravariant or mixed tensor $T$ with the
metric tensor $g_{\mu\nu}$, and contract the index $\mu$ with one of the contravariant indices of
$T$, we get a new tensor in which this contravariant index is replaced by a covariant
index $\nu$. For instance, if $T^{\mu\rho}{}_{\sigma}$ is a tensor, and we define

$$S_{\nu}{}^{\rho}{}_{\sigma} \equiv g_{\mu\nu} T^{\mu\rho}{}_{\sigma}$$

then by rules (B) and (C), $S_{\nu}{}^{\rho}{}_{\sigma}$ will be a tensor. Similarly, if we take the direct
product of a covariant or mixed tensor $T$ with the inverse metric tensor $g^{\mu\nu}$, and
contract the index $\mu$ with one of the covariant indices of $T$, we get a new tensor in
which this covariant index is replaced by a contravariant index $\nu$. For instance, if
$S_{\mu}{}^{\rho}{}_{\sigma}$ is a tensor, and we define

$$R^{\nu\rho}{}_{\sigma} \equiv g^{\nu\mu} S_{\mu}{}^{\rho}{}_{\sigma}$$

then $R^{\nu\rho}{}_{\sigma}$ is also a tensor. Note that lowering an index and then raising it again
gives back the original tensor; for instance, in the examples cited above, we
lowered an index on $T$ to get $S$ and then raised it again to get $R$, so $R = T$, because

$$R^{\nu\rho}{}_{\sigma} = g^{\nu\mu} S_{\mu}{}^{\rho}{}_{\sigma} = g^{\nu\mu} g_{\lambda\mu} T^{\lambda\rho}{}_{\sigma} = T^{\nu\rho}{}_{\sigma}$$

By raising and lowering indices we can write a tensor with $N$ indices in $2^N$ different
ways. Since these are all physically equivalent, it is customary to use the same
symbol for all $2^N$ tensors, distinguishing them only by their index locations.
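
The round trip of lowering and then raising an index can be checked with exact arithmetic. The sketch below is a toy illustration, not part of the original text: it assumes a two-dimensional example with an arbitrarily chosen symmetric nonsingular matrix playing the role of $g_{\mu\nu}$, and arbitrary integer components for $T^{\mu\rho}{}_{\sigma}$. The function names are hypothetical.

```python
from fractions import Fraction

N = 2  # a two-dimensional toy example; the text works in four dimensions

# A symmetric, nonsingular "metric" chosen arbitrarily for illustration.
g = [[Fraction(2), Fraction(1)],
     [Fraction(1), Fraction(3)]]

det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
g_inv = [[g[1][1] / det, -g[0][1] / det],
         [-g[1][0] / det, g[0][0] / det]]

# Components T[mu][rho][sigma] of a tensor with index structure T^{mu rho}_sigma.
T = [[[Fraction(1 + 4 * mu + 2 * rho + sigma) for sigma in range(N)]
      for rho in range(N)] for mu in range(N)]

def lower_first(t):
    """S_nu^rho_sigma = g_{mu nu} T^{mu rho}_sigma."""
    return [[[sum(g[mu][nu] * t[mu][rho][sigma] for mu in range(N))
              for sigma in range(N)] for rho in range(N)] for nu in range(N)]

def raise_first(s):
    """R^{nu rho}_sigma = g^{nu mu} S_mu^rho_sigma."""
    return [[[sum(g_inv[nu][mu] * s[mu][rho][sigma] for mu in range(N))
              for sigma in range(N)] for rho in range(N)] for nu in range(N)]

S = lower_first(T)
R = raise_first(S)
print("lowering changes the components:", S != T)
print("round trip recovers T:", R == T)
```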

For the sake of completeness, it should be mentioned that the tensor obtained
by raising one index on the metric tensor $g_{\mu\nu}$ or by lowering one index on the
inverse metric tensor $g^{\mu\nu}$ is precisely the Kronecker tensor, because

$$g^{\lambda\mu} g_{\mu\nu} = \delta^\lambda_\nu$$

Also, raising both indices on $g_{\mu\nu}$ gives the inverse tensor

$$g^{\lambda\mu} g^{\kappa\nu} g_{\mu\nu} = g^{\lambda\kappa}$$

and lowering both indices on $g^{\lambda\kappa}$ gives the metric tensor $g_{\mu\nu}$.

The reader will have noticed that this discussion of tensor algebra is precisely 
the same as the corresponding discussion in the chapter on special relativity 
(see Section 2.5) with one important exception: I have not yet mentioned 
differentiation. This is because the derivative of a tensor is in general not a tensor. 
In Section 6 we shall see that there is a kind of differentiation, called covariant 
differentiation, that provides one more way of constructing tensors from other 
tensors. 


4 Tensor Densities 


Despite the ubiquity of tensors, there is nothing sacred about the tensor 
transformation law. One very important example of a nontensor is the determinant
of the metric tensor

$$g \equiv -\text{Det } g_{\mu\nu} \tag{4.4.1}$$

The transformation rule for the metric tensor can be regarded as a matrix equation

$$g'_{\mu\nu} = \frac{\partial x^\rho}{\partial x'^\mu} g_{\rho\sigma} \frac{\partial x^\sigma}{\partial x'^\nu}$$

and taking the determinant, we find that

$$g' = \left| \frac{\partial x}{\partial x'} \right|^2 g \tag{4.4.2}$$

where $|\partial x/\partial x'|$ is the Jacobian of the transformation $x' \to x$; that is, it is the
determinant of the matrix $\partial x^\rho/\partial x'^\mu$. A quantity such as $g$, which transforms like a
scalar except for extra factors of the Jacobian, is called a scalar density, and
similarly a quantity that transforms as a tensor except for extra factors of the
Jacobian determinant is called a tensor density. The number of factors of the
determinant $|\partial x'/\partial x|$ is called the weight of the density; for instance, we see from
(4.4.2) that $g$ is a density of weight $-2$, because

$$\left| \frac{\partial x}{\partial x'} \right| = \left| \frac{\partial x'}{\partial x} \right|^{-1}$$

as can be seen by taking the determinant of the equation

$$\frac{\partial x^\mu}{\partial x'^\lambda} \frac{\partial x'^\lambda}{\partial x^\nu} = \delta^\mu_\nu \tag{4.4.3}$$
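Any tensor density of weight $W$ can be expressed as an ordinary tensor times a
factor $g^{-W/2}$. For instance, a tensor density $\mathfrak{T}^\mu{}_\nu$ of weight $W$ has the transformation
rule

$$\mathfrak{T}'^\mu{}_\nu = \left| \frac{\partial x'}{\partial x} \right|^W \frac{\partial x'^\mu}{\partial x^\lambda} \frac{\partial x^\kappa}{\partial x'^\nu} \mathfrak{T}^\lambda{}_\kappa \tag{4.4.4}$$

and using (4.4.2), we see that

$$g'^{W/2}\, \mathfrak{T}'^\mu{}_\nu = \frac{\partial x'^\mu}{\partial x^\lambda} \frac{\partial x^\kappa}{\partial x'^\nu}\, g^{W/2}\, \mathfrak{T}^\lambda{}_\kappa \tag{4.4.5}$$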


The importance of tensor densities arises from the fundamental theorem of
integral calculus,$^{4}$ that under a general coordinate transformation $x \to x'$, the
volume element $d^4x$ becomes

$$d^4x' = \left| \frac{\partial x'}{\partial x} \right| d^4x \tag{4.4.6}$$

Hence the product of $d^4x$ with a tensor density of weight $-1$ transforms like an
ordinary tensor. In particular, $\sqrt{g}\, d^4x$ is an invariant volume element.
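
The determinant rule (4.4.2) and the resulting volume factor $\sqrt{g}$ can be checked in a familiar case. The sketch below is an illustration under stated assumptions, not part of the original text: the unprimed coordinates are taken to be $(x, y, z, t)$ with the Minkowski metric, the primed coordinates are $(r, \theta, \varphi, t)$, and the sample point is arbitrary. The helper `det` is a hypothetical name.

```python
import math

def det(m):
    """Determinant by Laplace expansion along the first row (fine for 4 x 4)."""
    if len(m) == 1:
        return m[0][0]
    total = 0.0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

# Unprimed system: x = (x, y, z, t) with the Minkowski metric.
eta = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, -1]]

# Primed system: x' = (r, theta, phi, t), evaluated at an arbitrary point.
r, th, ph = 2.0, 0.7, 1.1
s, c, sp, cp = math.sin(th), math.cos(th), math.sin(ph), math.cos(ph)

# jac[rho][mu] = dx^rho / dx'^mu, written out analytically.
jac = [
    [s * cp, r * c * cp, -r * s * sp, 0.0],
    [s * sp, r * c * sp,  r * s * cp, 0.0],
    [c,      -r * s,      0.0,        0.0],
    [0.0,    0.0,         0.0,        1.0],
]

# g'_{mu nu} = (dx^rho/dx'^mu) eta_{rho sigma} (dx^sigma/dx'^nu)
g_prime = [[sum(jac[rho][mu] * eta[rho][sig] * jac[sig][nu]
                for rho in range(4) for sig in range(4))
            for nu in range(4)] for mu in range(4)]

g_old = -det(eta)        # g in the unprimed system
g_new = -det(g_prime)    # g' in the primed system
jacobian = det(jac)      # |dx/dx'|

print("g'             =", g_new)
print("|dx/dx'|^2 g   =", jacobian ** 2 * g_old)
print("r^4 sin^2(th)  =", r ** 4 * s ** 2)
```

The three printed numbers agree, so $\sqrt{g'} = r^2 \sin\theta$ is the usual spherical volume factor.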

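There is one tensor density whose components are the same in all coordinate
systems; it is the Levi-Civita tensor density $\epsilon^{\mu\nu\lambda\kappa}$. To define this quantity in a
general coordinate system, we must arbitrarily order the coordinate indices in a
reference sequence, for example, $x, y, z, t$ or $r, \theta, \varphi, t$, and so on. Then $\epsilon^{\mu\nu\lambda\kappa}$ is
defined by

$$\epsilon^{\mu\nu\lambda\kappa} = \begin{cases} +1 & \mu\nu\lambda\kappa \text{ even permutation of reference sequence} \\ -1 & \mu\nu\lambda\kappa \text{ odd permutation of reference sequence} \\ 0 & \text{some indices equal} \end{cases} \tag{4.4.7}$$

To see that this is a tensor density, consider the quantity

$$\frac{\partial x'^\rho}{\partial x^\mu} \frac{\partial x'^\sigma}{\partial x^\nu} \frac{\partial x'^\eta}{\partial x^\lambda} \frac{\partial x'^\xi}{\partial x^\kappa} \epsilon^{\mu\nu\lambda\kappa} \tag{4.4.8}$$

We note that this is completely antisymmetric in the indices $\rho, \sigma, \eta, \xi$ and therefore
proportional to $\epsilon^{\rho\sigma\eta\xi}$. To determine the proportionality constant, let $\rho\sigma\eta\xi$ take the
values of the reference sequence; then (4.4.8) is just the determinant $|\partial x'/\partial x|$, and
so

$$\frac{\partial x'^\rho}{\partial x^\mu} \frac{\partial x'^\sigma}{\partial x^\nu} \frac{\partial x'^\eta}{\partial x^\lambda} \frac{\partial x'^\xi}{\partial x^\kappa} \epsilon^{\mu\nu\lambda\kappa} = \left| \frac{\partial x'}{\partial x} \right| \epsilon^{\rho\sigma\eta\xi} \tag{4.4.9}$$

Thus $\epsilon^{\mu\nu\lambda\kappa}$ is a tensor density of weight $-1$. We can form an ordinary contravariant
tensor by multiplying $\epsilon^{\mu\nu\lambda\kappa}$ by $g^{-1/2}$. We can also form a covariant density
by lowering indices in the usual way, that is,

$$\epsilon_{\rho\sigma\eta\xi} \equiv g_{\rho\mu} g_{\sigma\nu} g_{\eta\lambda} g_{\xi\kappa} \epsilon^{\mu\nu\lambda\kappa} \tag{4.4.10}$$

This is antisymmetric in its indices, and therefore proportional to $\epsilon^{\rho\sigma\eta\xi}$. By setting
$\rho\sigma\eta\xi$ equal to the reference sequence, we find that the proportionality constant
must be $-g$, so

$$\epsilon_{\rho\sigma\eta\xi} = -g\, \epsilon^{\rho\sigma\eta\xi} \tag{4.4.11}$$

The reader may easily verify that $\epsilon_{\rho\sigma\eta\xi}$ is a covariant tensor density of weight $-1$.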
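
Relation (4.4.9) is a purely algebraic identity and can be checked by brute force. The sketch below is an illustration, not part of the original text: the integer matrix `M` is an arbitrary stand-in for $\partial x'^\rho/\partial x^\mu$ at one point, the reference sequence is taken to be the index order (0, 1, 2, 3), and all function names are hypothetical.

```python
from itertools import permutations, product

def parity(perm):
    """+1 for an even permutation of (0, 1, 2, 3), -1 for an odd one."""
    perm = list(perm)
    sign = 1
    for i in range(len(perm)):
        while perm[i] != i:
            j = perm[i]
            perm[i], perm[j] = perm[j], perm[i]
            sign = -sign
    return sign

def epsilon(*idx):
    """Levi-Civita symbol (4.4.7) with reference sequence (0, 1, 2, 3)."""
    return parity(idx) if len(set(idx)) == 4 else 0

def det(m):
    """Determinant by Laplace expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * entry * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j, entry in enumerate(m[0]))

# An arbitrary integer matrix standing in for dx'^rho / dx^mu at one point.
M = [[2, 1, 0, 1],
     [0, 1, 3, 0],
     [1, 0, 1, 2],
     [0, 2, 0, 1]]

def transformed_epsilon(rho, sig, eta, xi):
    """Expression (4.4.8): four Jacobian factors contracted with epsilon.

    Only the 24 permutations contribute, since epsilon vanishes otherwise.
    """
    return sum(parity(p) * M[rho][p[0]] * M[sig][p[1]] * M[eta][p[2]] * M[xi][p[3]]
               for p in permutations(range(4)))

d = det(M)
ok = all(transformed_epsilon(*idx) == d * epsilon(*idx)
         for idx in product(range(4), repeat=4))
print("Det M =", d)
print("(4.4.9) holds for all 256 index choices:", ok)
```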

The rules of tensor algebra may be easily extended to encompass tensor 
densities. 

(A) The linear combination of two tensor densities of the same weight $W$ is a
tensor density of weight $W$.

(B) The direct product of two tensor densities of weight $W_1$, $W_2$ yields a
tensor density of weight $W_1 + W_2$.

(C) The contraction of indices on a tensor density of weight W yields a tensor 
density of weight W. From (B) and (C) it follows that raising and lowering indices 
does not change the weight of a tensor density. 


5 Transformation of the Affine Connection 


Apart from the rather trivial example of the tensor densities, there appears 
throughout the laws of physics one other very important nontensor, the affine 
connection. We recall its definition, 


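$$\Gamma^\lambda_{\mu\nu} \equiv \frac{\partial x^\lambda}{\partial \xi^\alpha} \frac{\partial^2 \xi^\alpha}{\partial x^\mu \partial x^\nu} \tag{4.5.1}$$

where $\xi^\alpha(x)$ is the locally inertial coordinate system. Passing from $x^\mu$ to a different
system $x'^\mu$, we find that

$$\begin{aligned}
\Gamma'^\lambda_{\mu\nu} &\equiv \frac{\partial x'^\lambda}{\partial \xi^\alpha} \frac{\partial^2 \xi^\alpha}{\partial x'^\mu \partial x'^\nu} \\
&= \frac{\partial x'^\lambda}{\partial x^\rho} \frac{\partial x^\rho}{\partial \xi^\alpha} \frac{\partial}{\partial x'^\mu} \left( \frac{\partial x^\sigma}{\partial x'^\nu} \frac{\partial \xi^\alpha}{\partial x^\sigma} \right) \\
&= \frac{\partial x'^\lambda}{\partial x^\rho} \frac{\partial x^\rho}{\partial \xi^\alpha} \left[ \frac{\partial x^\sigma}{\partial x'^\nu} \frac{\partial x^\tau}{\partial x'^\mu} \frac{\partial^2 \xi^\alpha}{\partial x^\tau \partial x^\sigma} + \frac{\partial^2 x^\sigma}{\partial x'^\mu \partial x'^\nu} \frac{\partial \xi^\alpha}{\partial x^\sigma} \right]
\end{aligned}$$

and referring back to Eq. (4.5.1), this is

$$\Gamma'^\lambda_{\mu\nu} = \frac{\partial x'^\lambda}{\partial x^\rho} \frac{\partial x^\tau}{\partial x'^\mu} \frac{\partial x^\sigma}{\partial x'^\nu} \Gamma^\rho_{\tau\sigma} + \frac{\partial x'^\lambda}{\partial x^\rho} \frac{\partial^2 x^\rho}{\partial x'^\mu \partial x'^\nu} \tag{4.5.2}$$

The first term on the right is what we would expect if $\Gamma^\lambda_{\mu\nu}$ were a tensor; the second
term is inhomogeneous, and makes it a nontensor.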
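Tensor analysis provides a very simple way of establishing the relation
between $\Gamma^\lambda_{\mu\nu}$ and $g_{\mu\nu}$. Note that

$$\begin{aligned}
\frac{\partial g'_{\mu\nu}}{\partial x'^\kappa} &= \frac{\partial}{\partial x'^\kappa} \left( g_{\rho\sigma} \frac{\partial x^\rho}{\partial x'^\mu} \frac{\partial x^\sigma}{\partial x'^\nu} \right) \\
&= \frac{\partial g_{\rho\sigma}}{\partial x^\tau} \frac{\partial x^\tau}{\partial x'^\kappa} \frac{\partial x^\rho}{\partial x'^\mu} \frac{\partial x^\sigma}{\partial x'^\nu} + g_{\rho\sigma} \frac{\partial^2 x^\rho}{\partial x'^\kappa \partial x'^\mu} \frac{\partial x^\sigma}{\partial x'^\nu} + g_{\rho\sigma} \frac{\partial x^\rho}{\partial x'^\mu} \frac{\partial^2 x^\sigma}{\partial x'^\kappa \partial x'^\nu}
\end{aligned}$$

so

$$\begin{aligned}
\frac{\partial g'_{\kappa\nu}}{\partial x'^\mu} + \frac{\partial g'_{\kappa\mu}}{\partial x'^\nu} - \frac{\partial g'_{\mu\nu}}{\partial x'^\kappa} &= \frac{\partial x^\tau}{\partial x'^\kappa} \frac{\partial x^\rho}{\partial x'^\mu} \frac{\partial x^\sigma}{\partial x'^\nu} \left( \frac{\partial g_{\sigma\tau}}{\partial x^\rho} + \frac{\partial g_{\rho\tau}}{\partial x^\sigma} - \frac{\partial g_{\rho\sigma}}{\partial x^\tau} \right) \\
&\quad + 2 g_{\rho\sigma} \frac{\partial^2 x^\rho}{\partial x'^\mu \partial x'^\nu} \frac{\partial x^\sigma}{\partial x'^\kappa}
\end{aligned}$$

It follows that

$$\begin{Bmatrix} \lambda \\ \mu\nu \end{Bmatrix}' = \frac{\partial x'^\lambda}{\partial x^\rho} \frac{\partial x^\tau}{\partial x'^\mu} \frac{\partial x^\sigma}{\partial x'^\nu} \begin{Bmatrix} \rho \\ \tau\sigma \end{Bmatrix} + \frac{\partial x'^\lambda}{\partial x^\rho} \frac{\partial^2 x^\rho}{\partial x'^\mu \partial x'^\nu} \tag{4.5.3}$$

where

$$\begin{Bmatrix} \lambda \\ \mu\nu \end{Bmatrix} \equiv \frac{1}{2} g^{\lambda\kappa} \left[ \frac{\partial g_{\kappa\nu}}{\partial x^\mu} + \frac{\partial g_{\kappa\mu}}{\partial x^\nu} - \frac{\partial g_{\mu\nu}}{\partial x^\kappa} \right] \tag{4.5.4}$$

Subtracting (4.5.3) from (4.5.2), we see that $\Gamma^\lambda_{\mu\nu}$ minus $\begin{Bmatrix} \lambda \\ \mu\nu \end{Bmatrix}$ is a tensor,

$$\left[ \Gamma^\lambda_{\mu\nu} - \begin{Bmatrix} \lambda \\ \mu\nu \end{Bmatrix} \right]' = \frac{\partial x'^\lambda}{\partial x^\rho} \frac{\partial x^\tau}{\partial x'^\mu} \frac{\partial x^\sigma}{\partial x'^\nu} \left[ \Gamma^\rho_{\tau\sigma} - \begin{Bmatrix} \rho \\ \tau\sigma \end{Bmatrix} \right] \tag{4.5.5}$$

The equivalence principle tells us that there is a special coordinate system in
which, at a given point $X$, the effects of gravitation are absent. In this system there
can be no gravitational force on free particles, so $\Gamma^\lambda_{\mu\nu}$ vanishes, and there can be no
gravitational red shift between infinitesimally separated points, so the first
derivatives of $g_{\mu\nu}$ vanish. Since $\Gamma^\lambda_{\mu\nu} - \begin{Bmatrix} \lambda \\ \mu\nu \end{Bmatrix}$ vanishes in a locally inertial coordinate
system, and since it is a tensor, it must vanish in all coordinate systems, that is,

$$\Gamma^\lambda_{\mu\nu} = \begin{Bmatrix} \lambda \\ \mu\nu \end{Bmatrix} \tag{4.5.6}$$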
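
Formula (4.5.4) can be evaluated directly once a metric is given. The sketch below is an illustration under stated assumptions, not part of the original text: it uses the flat two-dimensional Euclidean plane in polar coordinates, $g = \mathrm{diag}(1, r^2)$, rather than a four-dimensional space-time metric, and it estimates the metric derivatives by central differences. All function names are hypothetical.

```python
def metric(x):
    """Plane polar coordinates x = (r, theta): g = diag(1, r^2). Chosen for illustration."""
    r, _theta = x
    return [[1.0, 0.0], [0.0, r * r]]

def inverse_2x2(m):
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det], [-m[1][0] / det, m[0][0] / det]]

def d_metric(x, k, h=1e-5):
    """Central-difference estimate of d g_{ij} / d x^k at x."""
    up = list(x); up[k] += h
    dn = list(x); dn[k] -= h
    gu, gd = metric(up), metric(dn)
    return [[(gu[i][j] - gd[i][j]) / (2 * h) for j in range(2)] for i in range(2)]

def christoffel(x):
    """gamma[lam][mu][nu] from (4.5.4):
    (1/2) g^{lam kap} (d_mu g_{kap nu} + d_nu g_{kap mu} - d_kap g_{mu nu})."""
    g_inv = inverse_2x2(metric(x))
    dg = [d_metric(x, k) for k in range(2)]   # dg[k][i][j] = d g_{ij} / d x^k
    return [[[0.5 * sum(g_inv[lam][kap] *
                        (dg[mu][kap][nu] + dg[nu][kap][mu] - dg[kap][mu][nu])
                        for kap in range(2))
              for nu in range(2)] for mu in range(2)] for lam in range(2)]

r = 2.0
gamma = christoffel((r, 0.3))
print("Gamma^r_{theta theta} =", gamma[0][1][1], " compare -r  =", -r)
print("Gamma^theta_{r theta} =", gamma[1][0][1], " compare 1/r =", 1 / r)
```

The nonzero components come out as $-r$ and $1/r$, the standard values for plane polar coordinates, even though the space is flat: the connection here reflects only the choice of coordinates.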


It is useful to have at hand an alternative formula for the inhomogeneous term 
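in the transformation rule of $\Gamma^\lambda_{\mu\nu}$. Differentiate the identity

$$\frac{\partial x'^\lambda}{\partial x^\rho} \frac{\partial x^\rho}{\partial x'^\nu} = \delta^\lambda_\nu$$

with respect to $x'^\mu$; we find immediately that

$$\frac{\partial x'^\lambda}{\partial x^\rho} \frac{\partial^2 x^\rho}{\partial x'^\mu \partial x'^\nu} = -\frac{\partial x^\rho}{\partial x'^\nu} \frac{\partial x^\sigma}{\partial x'^\mu} \frac{\partial^2 x'^\lambda}{\partial x^\rho \partial x^\sigma} \tag{4.5.7}$$

We can therefore write (4.5.2) as

$$\Gamma'^\lambda_{\mu\nu} = \frac{\partial x'^\lambda}{\partial x^\rho} \frac{\partial x^\tau}{\partial x'^\mu} \frac{\partial x^\sigma}{\partial x'^\nu} \Gamma^\rho_{\tau\sigma} - \frac{\partial x^\rho}{\partial x'^\nu} \frac{\partial x^\sigma}{\partial x'^\mu} \frac{\partial^2 x'^\lambda}{\partial x^\rho \partial x^\sigma} \tag{4.5.8}$$

This is just what we would have found by first performing the inverse transformation
$x' \to x$, and then solving for $\Gamma'^\lambda_{\mu\nu}$.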

We are now in a position to use the Principle of General Covariance to give an 
alternative proof that a freely falling particle obeys the equation of motion 


$$
\frac{d^2 x^\mu}{d\tau^2} + \Gamma^\mu_{\nu\lambda}\,\frac{dx^\nu}{d\tau}\,\frac{dx^\lambda}{d\tau} = 0
\tag{4.5.9}
$$

where

$$
d\tau^2 = -g_{\mu\nu}\,dx^\mu\,dx^\nu
\tag{4.5.10}
$$


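First, note that Eqs. (4.5.9) and (4.5.10) are true in the absence of gravitation,
because setting $\Gamma^\mu_{\nu\lambda}$ equal to zero and $g_{\mu\nu}$ equal to $\eta_{\mu\nu}$ gives

$$
\frac{d^2 x^\mu}{d\tau^2} = 0 \qquad d\tau^2 = -\eta_{\mu\nu}\,dx^\mu\,dx^\nu
$$

and these are the correct equations for a free particle in special relativity. Second,
note that (4.5.9) and (4.5.10) are invariant under a general coordinate transformation, for

$$
\frac{d^2 x'^\mu}{d\tau^2}
= \frac{d}{d\tau}\left(\frac{\partial x'^\mu}{\partial x^\nu}\,\frac{dx^\nu}{d\tau}\right)
= \frac{\partial x'^\mu}{\partial x^\nu}\,\frac{d^2 x^\nu}{d\tau^2}
+ \frac{\partial^2 x'^\mu}{\partial x^\nu\,\partial x^\lambda}\,
  \frac{dx^\lambda}{d\tau}\,\frac{dx^\nu}{d\tau}
$$

whereas (4.5.8) gives

$$
\Gamma'^\mu_{\nu\lambda}\,\frac{dx'^\nu}{d\tau}\,\frac{dx'^\lambda}{d\tau}
= \frac{\partial x'^\mu}{\partial x^\tau}\,\Gamma^\tau_{\rho\sigma}\,
  \frac{dx^\rho}{d\tau}\,\frac{dx^\sigma}{d\tau}
- \frac{\partial^2 x'^\mu}{\partial x^\nu\,\partial x^\lambda}\,
  \frac{dx^\lambda}{d\tau}\,\frac{dx^\nu}{d\tau}
$$

Adding these two equations, we find that the left-hand side of Eq. (4.5.9) is a
vector, that is,

$$
\frac{d^2 x'^\mu}{d\tau^2}
+ \Gamma'^\mu_{\nu\lambda}\,\frac{dx'^\nu}{d\tau}\,\frac{dx'^\lambda}{d\tau}
= \frac{\partial x'^\mu}{\partial x^\kappa}
  \left(\frac{d^2 x^\kappa}{d\tau^2}
      + \Gamma^\kappa_{\sigma\rho}\,\frac{dx^\sigma}{d\tau}\,\frac{dx^\rho}{d\tau}\right)
$$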

Thus Eq. (4.5.9), as well as (4.5.10), is manifestly covariant. The Principle of 
General Covariance then tells us that (4.5.9) and (4.5.10) are true in general 
gravitational fields, because, to repeat the reasoning of Section 1, they are true in 
all coordinate systems if true in any one system, and they are true in the locally 
inertial coordinate systems. 
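
To see Eq. (4.5.9) at work in a case where the answer is known in advance, one can integrate it for the flat plane in polar coordinates, where the only nonvanishing connection components are $\Gamma^r_{\theta\theta} = -r$ and $\Gamma^\theta_{r\theta} = \Gamma^\theta_{\theta r} = 1/r$. The sketch below is an illustration under these assumptions (the Runge-Kutta integrator, step count, and initial data are our own choices); the resulting curve should be the straight line $x = 1$, $y = \tau$ when mapped back to Cartesian coordinates.

```python
import math


def geodesic_rhs(state):
    """Right-hand side of Eq. (4.5.9) for the flat plane in polar coordinates.

    state = (r, theta, dr/dtau, dtheta/dtau).
    """
    r, theta, vr, vtheta = state
    ar = r * vtheta * vtheta          # -Gamma^r_{theta theta} (dtheta/dtau)^2
    atheta = -2.0 * vr * vtheta / r   # -2 Gamma^theta_{r theta} (dr/dtau)(dtheta/dtau)
    return (vr, vtheta, ar, atheta)


def rk4_step(f, state, h):
    """One classical fourth-order Runge-Kutta step of size h."""
    k1 = f(state)
    k2 = f(tuple(s + 0.5 * h * k for s, k in zip(state, k1)))
    k3 = f(tuple(s + 0.5 * h * k for s, k in zip(state, k2)))
    k4 = f(tuple(s + h * k for s, k in zip(state, k3)))
    return tuple(s + h * (a + 2.0 * b + 2.0 * c + d) / 6.0
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))


def integrate_geodesic(state, tau_end, steps):
    """Integrate the geodesic equation from tau = 0 to tau = tau_end."""
    h = tau_end / steps
    for _ in range(steps):
        state = rk4_step(geodesic_rhs, state, h)
    return state


# Start at Cartesian (1, 0), moving with unit velocity in the +y direction.
r_end, theta_end, _, _ = integrate_geodesic((1.0, 0.0, 0.0, 1.0), 1.0, 1000)
x_end = r_end * math.cos(theta_end)
y_end = r_end * math.sin(theta_end)
```

The curved-looking trajectory $r(\tau)$, $\theta(\tau)$ is just the straight line of special relativity described in curvilinear coordinates, which is the content of the covariance argument above.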


6 Covariant Differentiation

We have already remarked that differentiation of a tensor does not generally
yield another tensor. For instance, consider a contravariant vector $V^\mu$, whose
transformation law is

$$
V'^\mu = \frac{\partial x'^\mu}{\partial x^\nu}\,V^\nu
$$

Differentiating with respect to $x'^\lambda$ gives

$$
\frac{\partial V'^\mu}{\partial x'^\lambda}
= \frac{\partial x'^\mu}{\partial x^\nu}\,\frac{\partial x^\rho}{\partial x'^\lambda}\,
  \frac{\partial V^\nu}{\partial x^\rho}
+ \frac{\partial^2 x'^\mu}{\partial x^\nu\,\partial x^\rho}\,
  \frac{\partial x^\rho}{\partial x'^\lambda}\,V^\nu
\tag{4.6.1}
$$

The first term on the right is what we would expect if $\partial V^\mu/\partial x^\lambda$ were a tensor; the
second term is what destroys the tensor behavior.

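Although $\partial V^\mu/\partial x^\lambda$ is not a tensor, we can use it to construct a tensor. Using
Eq. (4.5.8), we see that

$$
\begin{aligned}
\Gamma'^\mu_{\lambda\kappa}\,V'^\kappa
&= \left[\frac{\partial x'^\mu}{\partial x^\nu}\,\frac{\partial x^\rho}{\partial x'^\lambda}\,
   \frac{\partial x^\sigma}{\partial x'^\kappa}\,\Gamma^\nu_{\rho\sigma}
 - \frac{\partial^2 x'^\mu}{\partial x^\rho\,\partial x^\sigma}\,
   \frac{\partial x^\rho}{\partial x'^\lambda}\,\frac{\partial x^\sigma}{\partial x'^\kappa}\right]
   \frac{\partial x'^\kappa}{\partial x^\eta}\,V^\eta \\
&= \frac{\partial x'^\mu}{\partial x^\nu}\,\frac{\partial x^\rho}{\partial x'^\lambda}\,
   \Gamma^\nu_{\rho\sigma}\,V^\sigma
 - \frac{\partial^2 x'^\mu}{\partial x^\rho\,\partial x^\sigma}\,
   \frac{\partial x^\rho}{\partial x'^\lambda}\,V^\sigma
\end{aligned}
\tag{4.6.2}
$$

Adding (4.6.1) and (4.6.2), we find that the inhomogeneous terms cancel, yielding

$$
\frac{\partial V'^\mu}{\partial x'^\lambda} + \Gamma'^\mu_{\lambda\kappa}\,V'^\kappa
= \frac{\partial x'^\mu}{\partial x^\nu}\,\frac{\partial x^\rho}{\partial x'^\lambda}
  \left(\frac{\partial V^\nu}{\partial x^\rho} + \Gamma^\nu_{\rho\sigma}\,V^\sigma\right)
\tag{4.6.3}
$$

Thus we are led to define a covariant derivative

$$
V^\mu{}_{;\lambda} \equiv \frac{\partial V^\mu}{\partial x^\lambda} + \Gamma^\mu_{\lambda\kappa}\,V^\kappa
\tag{4.6.4}
$$

and (4.6.3) tells us that $V^\mu{}_{;\lambda}$ is a tensor:

$$
V'^\mu{}_{;\lambda}
= \frac{\partial x'^\mu}{\partial x^\nu}\,\frac{\partial x^\rho}{\partial x'^\lambda}\,V^\nu{}_{;\rho}
$$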


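We can also define a covariant derivative for a covariant vector $V_\mu$. We recall
its transformation rule

$$
V'_\mu = \frac{\partial x^\rho}{\partial x'^\mu}\,V_\rho
$$

Differentiate with respect to $x'^\nu$:

$$
\frac{\partial V'_\mu}{\partial x'^\nu}
= \frac{\partial x^\rho}{\partial x'^\mu}\,\frac{\partial x^\sigma}{\partial x'^\nu}\,
  \frac{\partial V_\rho}{\partial x^\sigma}
+ \frac{\partial^2 x^\rho}{\partial x'^\mu\,\partial x'^\nu}\,V_\rho
\tag{4.6.5}
$$

From (4.5.2) we have

$$
\begin{aligned}
\Gamma'^\lambda_{\mu\nu}\,V'_\lambda
&= \left[\frac{\partial x'^\lambda}{\partial x^\rho}\,\frac{\partial x^\tau}{\partial x'^\mu}\,
   \frac{\partial x^\sigma}{\partial x'^\nu}\,\Gamma^\rho_{\tau\sigma}
 + \frac{\partial x'^\lambda}{\partial x^\rho}\,
   \frac{\partial^2 x^\rho}{\partial x'^\mu\,\partial x'^\nu}\right]
   \frac{\partial x^\kappa}{\partial x'^\lambda}\,V_\kappa \\
&= \frac{\partial x^\tau}{\partial x'^\mu}\,\frac{\partial x^\sigma}{\partial x'^\nu}\,
   \Gamma^\kappa_{\tau\sigma}\,V_\kappa
 + \frac{\partial^2 x^\kappa}{\partial x'^\mu\,\partial x'^\nu}\,V_\kappa
\end{aligned}
\tag{4.6.6}
$$

The inhomogeneous terms will cancel if we subtract (4.6.6) from (4.6.5):

$$
\frac{\partial V'_\mu}{\partial x'^\nu} - \Gamma'^\lambda_{\mu\nu}\,V'_\lambda
= \frac{\partial x^\rho}{\partial x'^\mu}\,\frac{\partial x^\sigma}{\partial x'^\nu}
  \left(\frac{\partial V_\rho}{\partial x^\sigma} - \Gamma^\kappa_{\rho\sigma}\,V_\kappa\right)
\tag{4.6.7}
$$

We therefore define a covariant derivative of a covariant vector

$$
V_{\mu;\nu} \equiv \frac{\partial V_\mu}{\partial x^\nu} - \Gamma^\lambda_{\mu\nu}\,V_\lambda
\tag{4.6.8}
$$

and Eq. (4.6.7) tells us that $V_{\mu;\nu}$ is a tensor

$$
V'_{\mu;\nu}
= \frac{\partial x^\rho}{\partial x'^\mu}\,\frac{\partial x^\sigma}{\partial x'^\nu}\,V_{\rho;\sigma}
\tag{4.6.9}
$$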


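It is obvious how these definitions are to be extended to a general tensor.
The covariant derivative with respect to $x^\rho$ of a tensor $T^{\cdots}{}_{\cdots}$ equals
$\partial T^{\cdots}{}_{\cdots}/\partial x^\rho$, plus
for each contravariant index $\mu$ a term given by $\Gamma^\mu_{\nu\rho}$ times $T$ with $\mu$ replaced with $\nu$,
minus for each covariant index $\lambda$ a term $\Gamma^\kappa_{\lambda\rho}$ times $T$ with $\lambda$ replaced with $\kappa$. For
instance,

$$
T^{\mu\sigma}{}_{\lambda;\rho}
= \frac{\partial}{\partial x^\rho}\,T^{\mu\sigma}{}_\lambda
+ \Gamma^\mu_{\rho\nu}\,T^{\nu\sigma}{}_\lambda
+ \Gamma^\sigma_{\rho\nu}\,T^{\mu\nu}{}_\lambda
- \Gamma^\kappa_{\lambda\rho}\,T^{\mu\sigma}{}_\kappa
\tag{4.6.10}
$$

The reader may easily verify that this is a tensor.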

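We can also extend the idea of covariant differentiation to tensor densities.
The easiest way to do this is to recall that if $\mathcal{T}$ is a tensor density of weight $W$, then
$g^{W/2}\mathcal{T}$ is an ordinary tensor. Its covariant derivative is also a tensor, and multiplying
by $g^{-W/2}$ gives back a tensor density of weight $W$. Hence the covariant
derivative of a tensor density of weight $W$ is defined by

$$
\mathcal{T}^{\cdots}{}_{\cdots;\rho}
\equiv g^{-W/2}\left(g^{W/2}\,\mathcal{T}^{\cdots}{}_{\cdots}\right)_{;\rho}
\tag{4.6.11}
$$

and we need not check that this is a tensor density of weight $W$. The effect is that
the covariant derivative with respect to $x^\rho$ of a tensor density $\mathcal{T}$ of weight $W$ is
constructed just as if it were an ordinary tensor, except that we add an extra term
$(W/2g)\,\mathcal{T}^{\cdots}{}_{\cdots}\,(\partial g/\partial x^\rho)$. For instance,

$$
\mathcal{T}^\mu{}_{\lambda;\rho}
= \frac{\partial}{\partial x^\rho}\,\mathcal{T}^\mu{}_\lambda
+ \Gamma^\mu_{\rho\nu}\,\mathcal{T}^\nu{}_\lambda
- \Gamma^\kappa_{\lambda\rho}\,\mathcal{T}^\mu{}_\kappa
+ \frac{W}{2g}\,\frac{\partial g}{\partial x^\rho}\,\mathcal{T}^\mu{}_\lambda
\tag{4.6.12}
$$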

The combination of covariant differentiation with the algebraic operations
described in Section 3 gives results similar to those for ordinary differentiation.
In particular:

(A) The covariant derivative of a linear combination of tensors (with constant
coefficients) is the same linear combination of the covariant derivatives. For instance,
if $\alpha$ and $\beta$ are constants, then

$$
(\alpha A^\mu{}_\nu + \beta B^\mu{}_\nu)_{;\lambda}
= \alpha A^\mu{}_{\nu;\lambda} + \beta B^\mu{}_{\nu;\lambda}
\tag{4.6.13}
$$

(B) The covariant derivative of a direct product of tensors obeys the Leibniz
rule. For instance,

$$
(A^\mu{}_\nu B^\lambda)_{;\rho}
= A^\mu{}_{\nu;\rho}\,B^\lambda + A^\mu{}_\nu\,B^\lambda{}_{;\rho}
\tag{4.6.14}
$$

(C) The covariant derivative of a contracted tensor is the contraction of the
covariant derivative. For instance, setting $\sigma = \lambda$ in Eq. (4.6.10) gives

$$
T^{\mu\lambda}{}_{\lambda;\rho}
= \frac{\partial}{\partial x^\rho}\,T^{\mu\lambda}{}_\lambda
+ \Gamma^\mu_{\rho\nu}\,T^{\nu\lambda}{}_\lambda
\tag{4.6.15}
$$

the last two terms canceling.

We also note that the covariant derivative of the metric tensor is zero, because
it vanishes in locally inertial coordinates where $\Gamma^\lambda_{\mu\nu}$ and
$\partial g_{\mu\nu}/\partial x^\lambda$ vanish, and a
tensor, zero in one coordinate system, is zero in all systems. The same result can be
obtained more directly by noting that

$$
g_{\mu\nu;\lambda}
= \frac{\partial g_{\mu\nu}}{\partial x^\lambda}
- \Gamma^\rho_{\lambda\mu}\,g_{\rho\nu}
- \Gamma^\rho_{\lambda\nu}\,g_{\rho\mu}
$$

Eq. (3.3.1) tells us that this vanishes:

$$
g_{\mu\nu;\lambda} = 0
\tag{4.6.16}
$$

(This argument can be reversed to provide yet another derivation of the relation
between $g_{\mu\nu}$ and $\Gamma^\lambda_{\mu\nu}$.) We can also show in the same way that the covariant
derivatives of the other forms of the metric tensor also vanish, that is,

$$
g^{\mu\nu}{}_{;\lambda} = 0
\tag{4.6.17}
$$

$$
\delta^\mu{}_{\nu;\lambda} = 0
\tag{4.6.18}
$$

From (4.6.16)-(4.6.18) it follows that the operations of covariant differentiation
and raising and lowering indices commute; for example,

$$
(g^{\mu\nu}V_\nu)_{;\lambda} = g^{\mu\nu}\,V_{\nu;\lambda}
\tag{4.6.19}
$$
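
The vanishing of $g_{\mu\nu;\lambda}$ can be checked component by component in a simple curved space. The sketch below does this for the unit two-sphere with coordinates $(\theta, \varphi)$, metric $g_{\theta\theta} = 1$, $g_{\varphi\varphi} = \sin^2\theta$, and the standard connection components $\Gamma^\theta_{\varphi\varphi} = -\sin\theta\cos\theta$, $\Gamma^\varphi_{\theta\varphi} = \cot\theta$. The sphere and all function names are illustrative assumptions; the formula evaluated is the one displayed just above (4.6.16).

```python
import math


def sphere_metric(theta):
    """g[m][n] for the unit sphere, indices 0 = theta, 1 = phi."""
    return [[1.0, 0.0], [0.0, math.sin(theta) ** 2]]


def sphere_metric_derivs(theta):
    """dg[l][m][n] = d g_{mn} / d x^l (analytic)."""
    d_theta = [[0.0, 0.0], [0.0, 2.0 * math.sin(theta) * math.cos(theta)]]
    d_phi = [[0.0, 0.0], [0.0, 0.0]]
    return [d_theta, d_phi]


def sphere_connection(theta):
    """gamma[r][m][n] = Gamma^r_{mn} for the unit sphere."""
    gamma = [[[0.0, 0.0], [0.0, 0.0]] for _ in range(2)]
    gamma[0][1][1] = -math.sin(theta) * math.cos(theta)
    gamma[1][0][1] = gamma[1][1][0] = math.cos(theta) / math.sin(theta)
    return gamma


def metric_covariant_derivative(theta):
    """out[m][n][l] = d_l g_{mn} - Gamma^r_{lm} g_{rn} - Gamma^r_{ln} g_{rm}."""
    g = sphere_metric(theta)
    dg = sphere_metric_derivs(theta)
    gamma = sphere_connection(theta)
    out = [[[0.0, 0.0], [0.0, 0.0]] for _ in range(2)]
    for m in range(2):
        for n in range(2):
            for l in range(2):
                out[m][n][l] = (
                    dg[l][m][n]
                    - sum(gamma[r][l][m] * g[r][n] for r in range(2))
                    - sum(gamma[r][l][n] * g[r][m] for r in range(2)))
    return out


result = metric_covariant_derivative(0.9)
largest = max(abs(result[m][n][l])
              for m in range(2) for n in range(2) for l in range(2))
```

All eight components vanish to rounding error, even though neither the ordinary derivative of the metric nor the connection vanishes separately.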

The importance of covariant differentiation arises from two of its properties:
It converts tensors to other tensors, and it reduces to ordinary differentiation in
the absence of gravitation, that is, when $\Gamma^\mu_{\nu\lambda} = 0$. These properties suggest the
following algorithm for assessing the effects of gravitation on physical systems:
Write the appropriate special-relativistic equations that hold in the absence of gravitation,
replace $\eta_{\alpha\beta}$ with $g_{\alpha\beta}$, and replace all derivatives with covariant derivatives. The
resulting equations will be generally covariant and true in the absence of gravitation,
and therefore, according to the Principle of General Covariance, they will be
true in the presence of gravitational fields, provided always that we work on a
space-time scale sufficiently small compared with the scale of the gravitational
field.


7 Gradient, Curl, and Divergence 

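There are some special cases where the covariant derivative takes a particularly
simple form. Simplest of all is, of course, the covariant derivative of a scalar, which
is just the ordinary gradient

$$
S_{;\mu} = \frac{\partial S}{\partial x^\mu}
\tag{4.7.1}
$$

Another simple special case is the covariant curl. Recall that

$$
V_{\mu;\nu} \equiv \frac{\partial V_\mu}{\partial x^\nu} - \Gamma^\lambda_{\mu\nu}\,V_\lambda
$$

Since $\Gamma^\lambda_{\mu\nu}$ is symmetric in $\mu$ and $\nu$, the covariant curl is just the ordinary curl

$$
V_{\mu;\nu} - V_{\nu;\mu}
= \frac{\partial V_\mu}{\partial x^\nu} - \frac{\partial V_\nu}{\partial x^\mu}
\tag{4.7.2}
$$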


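Another special case that will take a little more work is the covariant divergence
of a contravariant vector

$$
V^\mu{}_{;\mu} \equiv \frac{\partial V^\mu}{\partial x^\mu} + \Gamma^\mu_{\mu\lambda}\,V^\lambda
\tag{4.7.3}
$$

We note that $\Gamma^\mu_{\mu\lambda}$ is given by

$$
\Gamma^\mu_{\mu\lambda}
= \frac{1}{2}\,g^{\mu\rho}
  \left\{\frac{\partial g_{\rho\mu}}{\partial x^\lambda}
       + \frac{\partial g_{\rho\lambda}}{\partial x^\mu}
       - \frac{\partial g_{\mu\lambda}}{\partial x^\rho}\right\}
= \frac{1}{2}\,g^{\mu\rho}\,\frac{\partial g_{\rho\mu}}{\partial x^\lambda}
\tag{4.7.4}
$$

We may evaluate this easily if we recall that for an arbitrary matrix $M$,

$$
\mathrm{Tr}\left\{M^{-1}(x)\,\frac{\partial}{\partial x^\lambda}M(x)\right\}
= \frac{\partial}{\partial x^\lambda}\,\ln \mathrm{Det}\,M(x)
\tag{4.7.5}
$$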
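where Det denotes the determinant and Tr the trace, that is, the sum of the
diagonal elements. To prove (4.7.5), consider the variation in $\ln \mathrm{Det}\,M$ owing to a
variation $\delta x^\lambda$ in $x^\lambda$:

$$
\begin{aligned}
\delta \ln \mathrm{Det}\,M
&\equiv \ln \mathrm{Det}\,(M + \delta M) - \ln \mathrm{Det}\,M \\
&= \ln \frac{\mathrm{Det}\,(M + \delta M)}{\mathrm{Det}\,M} \\
&= \ln \mathrm{Det}\,M^{-1}(M + \delta M) \\
&= \ln \mathrm{Det}\,(1 + M^{-1}\,\delta M) \\
&\to \ln\,(1 + \mathrm{Tr}\,M^{-1}\,\delta M) \\
&\to \mathrm{Tr}\,M^{-1}\,\delta M
\end{aligned}
$$

Taking the coefficient of $\delta x^\lambda$ on both sides gives Eq. (4.7.5). Applying (4.7.5) to the
case where $M$ is the matrix $g_{\mu\nu}$, we find from (4.7.4) that

$$
\Gamma^\mu_{\mu\lambda}
= \frac{1}{2}\,\frac{\partial}{\partial x^\lambda}\,\ln g
= \frac{1}{\sqrt{g}}\,\frac{\partial}{\partial x^\lambda}\,\sqrt{g}
\tag{4.7.6}
$$

With (4.7.3), we find that the covariant divergence is precisely

$$
V^\mu{}_{;\mu} = \frac{1}{\sqrt{g}}\,\frac{\partial}{\partial x^\mu}\left(\sqrt{g}\,V^\mu\right)
\tag{4.7.7}
$$

One immediate consequence is a covariant form of Gauss's theorem: If $V^\mu$ vanishes
at infinity then

$$
\int d^4x\,\sqrt{g}\;V^\mu{}_{;\mu} = 0
\tag{4.7.8}
$$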


Note the appearance here of a factor $\sqrt{g}$ that makes $d^4x\,\sqrt{g}$ invariant.
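
The matrix identity (4.7.5) that underlies these results is easy to spot-check numerically. The sketch below compares the two sides for an arbitrary $2 \times 2$ matrix function of a single variable; the particular matrix `M(x)`, its hand-written derivative `dM(x)`, and the evaluation point are illustrative assumptions chosen only so that the determinant stays positive.

```python
import math


def M(x):
    """An arbitrary smooth, invertible 2x2 matrix function (illustrative)."""
    return [[2.0 + x, x * x], [math.sin(x), 3.0 + x ** 3]]


def dM(x):
    """Analytic derivative of M with respect to x."""
    return [[1.0, 2.0 * x], [math.cos(x), 3.0 * x * x]]


def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def trace_minv_dm(x):
    """Left-hand side of (4.7.5): Tr(M^{-1} dM/dx)."""
    m, d = M(x), dM(x)
    det = det2(m)
    minv = [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]
    return sum(minv[i][j] * d[j][i] for i in range(2) for j in range(2))


def dlogdet(x, h=1e-5):
    """Right-hand side of (4.7.5) by a central difference of ln Det M."""
    return (math.log(det2(M(x + h))) - math.log(det2(M(x - h)))) / (2.0 * h)


lhs = trace_minv_dm(0.4)
rhs = dlogdet(0.4)
```

The two numbers agree to within the finite-difference error, as the limiting argument in the proof of (4.7.5) leads one to expect.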

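We can also use (4.7.6) to simplify the formula for the covariant divergence of
a tensor. For instance,

$$
T^{\mu\nu}{}_{;\mu}
\equiv \frac{\partial T^{\mu\nu}}{\partial x^\mu}
+ \Gamma^\mu_{\mu\lambda}\,T^{\lambda\nu}
+ \Gamma^\nu_{\mu\lambda}\,T^{\mu\lambda}
$$

and, applying (4.7.6), we find that

$$
T^{\mu\nu}{}_{;\mu}
= \frac{1}{\sqrt{g}}\,\frac{\partial}{\partial x^\mu}\left(\sqrt{g}\,T^{\mu\nu}\right)
+ \Gamma^\nu_{\mu\lambda}\,T^{\mu\lambda}
\tag{4.7.9}
$$

In particular, if $T^{\mu\lambda} = -T^{\lambda\mu}$, then the last term drops out,

$$
A^{\mu\nu}{}_{;\mu}
= \frac{1}{\sqrt{g}}\,\frac{\partial}{\partial x^\mu}\left(\sqrt{g}\,A^{\mu\nu}\right)
\qquad \text{for } A^{\mu\nu} \text{ antisymmetric}
\tag{4.7.10}
$$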
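
There is one other special case of some importance. For a covariant tensor $A_{\mu\nu}$
the covariant derivative is

$$
A_{\mu\nu;\lambda}
\equiv \frac{\partial A_{\mu\nu}}{\partial x^\lambda}
- \Gamma^\rho_{\mu\lambda}\,A_{\rho\nu}
- \Gamma^\rho_{\nu\lambda}\,A_{\mu\rho}
$$

Suppose that $A_{\mu\nu}$ is antisymmetric; that is,

$$
A_{\mu\nu} = -A_{\nu\mu}
$$

If we twice add to $A_{\mu\nu;\lambda}$ the same tensor with indices cyclically permuted, we find
by virtue of the symmetry of $\Gamma^\rho_{\mu\nu}$ and the antisymmetry of $A_{\mu\nu}$ that all $\Gamma$-terms
cancel, yielding

$$
A_{\mu\nu;\lambda} + A_{\lambda\mu;\nu} + A_{\nu\lambda;\mu}
= \frac{\partial A_{\mu\nu}}{\partial x^\lambda}
+ \frac{\partial A_{\lambda\mu}}{\partial x^\nu}
+ \frac{\partial A_{\nu\lambda}}{\partial x^\mu}
\qquad \text{for } A_{\mu\nu} \text{ antisymmetric}
\tag{4.7.11}
$$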


8 Vector Analysis in Orthogonal Coordinates* 

The reader may be wondering what the tensor analysis formalism outlined in
this chapter has to do with the familiar formulas for gradient, curl, and divergence
in the classical curvilinear coordinate systems. These are three-dimensional
coordinate systems characterized by the condition that $g_{ij}$ is diagonal, that is,

$$
g_{ij} = h_i^2\,\delta_{ij} \qquad (i, j = 1, 2, 3)
\tag{4.8.1}
$$

where $h_i$ is some function of the coordinates.$^5$ (The summation convention is
suspended for the duration of this section.) The inverse metric tensor is then

$$
g^{ij} = h_i^{-2}\,\delta_{ij}
\tag{4.8.2}
$$

The invariant proper length is now

$$
ds^2 = \sum_{i,j} g_{ij}\,dx^i\,dx^j
= h_1^2\,(dx^1)^2 + h_2^2\,(dx^2)^2 + h_3^2\,(dx^3)^2
\tag{4.8.3}
$$

and the invariant volume element is

$$
dV = (\mathrm{Det}\,g)^{1/2}\,dx^1\,dx^2\,dx^3 = h_1 h_2 h_3\,dx^1\,dx^2\,dx^3
\tag{4.8.4}
$$


* This section lies somewhat out of the book’s main line of development, and may be omitted in a first 
reading. 
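
What are usually called the components of a vector $\mathbf{V}$ in elementary treatments are
not the covariant components $V_i$ or the contravariant components $V^i$, but the
"ordinary" components $\bar{V}_i$:

$$
\bar{V}_i \equiv h_i\,V^i = h_i^{-1}\,V_i
\tag{4.8.5}
$$

The scalar product of two vectors is then very simple:

$$
\mathbf{V}\cdot\mathbf{U} \equiv \sum_{ij} g_{ij}\,V^i\,U^j
= \bar{V}_1\bar{U}_1 + \bar{V}_2\bar{U}_2 + \bar{V}_3\bar{U}_3
\tag{4.8.6}
$$

[This, of course, is the motivation for the definition (4.8.5).] However, the gradient
of a scalar is now a little more complicated:

$$
(\nabla S)_i \equiv h_i^{-1}\,S_{;i} = h_i^{-1}\,\frac{\partial S}{\partial x^i}
\tag{4.8.7}
$$

The curl of a vector $\mathbf{V}$ is likewise defined by taking the "ordinary" components of
a vector

$$
(\nabla\times\mathbf{V})_i
\equiv h_i \sum_{jk} (\mathrm{Det}\,g)^{-1/2}\,\epsilon^{ijk}\,V_{k;j}
= h_i\,(h_1 h_2 h_3)^{-1} \sum_{jk} \epsilon^{ijk}\,\frac{\partial}{\partial x^j}\left(h_k \bar{V}_k\right)
\tag{4.8.8}
$$

(We have used (4.7.2), since $\epsilon^{ijk}$ is antisymmetric in $j$ and $k$.) For instance, the first
component of the curl is

$$
(\nabla\times\mathbf{V})_1
= \frac{1}{h_2 h_3}\left(\frac{\partial}{\partial x^2}\,h_3\bar{V}_3
  - \frac{\partial}{\partial x^3}\,h_2\bar{V}_2\right)
\tag{4.8.9}
$$

The divergence of a vector $\mathbf{V}$ is nothing but the covariant divergence (4.7.7):

$$
\begin{aligned}
\nabla\cdot\mathbf{V} \equiv \sum_i V^i{}_{;i}
&= (\mathrm{Det}\,g)^{-1/2} \sum_i \frac{\partial}{\partial x^i}\,(\mathrm{Det}\,g)^{1/2}\,V^i \\
&= (h_1 h_2 h_3)^{-1}
   \left\{\frac{\partial}{\partial x^1}\,h_2 h_3 \bar{V}_1
        + \frac{\partial}{\partial x^2}\,h_1 h_3 \bar{V}_2
        + \frac{\partial}{\partial x^3}\,h_1 h_2 \bar{V}_3\right\}
\end{aligned}
\tag{4.8.10}
$$

The Laplacian of a scalar $S$ is the divergence of its gradient

$$
\nabla^2 S \equiv \sum_{ij} \left(g^{ij}\,S_{;i}\right)_{;j}
\tag{4.8.11}
$$

or combining (4.8.10) with (4.8.7),

$$
\nabla^2 S = (h_1 h_2 h_3)^{-1}
\left\{\frac{\partial}{\partial x^1}\,\frac{h_2 h_3}{h_1}\,\frac{\partial S}{\partial x^1}
     + \frac{\partial}{\partial x^2}\,\frac{h_1 h_3}{h_2}\,\frac{\partial S}{\partial x^2}
     + \frac{\partial}{\partial x^3}\,\frac{h_1 h_2}{h_3}\,\frac{\partial S}{\partial x^3}\right\}
$$

The reader may easily check that the usual formulas for gradient, curl, divergence
and Laplacian are obtained if the $h_i$ take the forms appropriate for spherical
or cylindrical coordinates.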
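
The check suggested to the reader can also be carried out numerically. The sketch below implements the Laplacian formula just given for a general set of scale factors and applies it in spherical coordinates $(r, \theta, \varphi)$, for which $h_1 = 1$, $h_2 = r$, $h_3 = r\sin\theta$. The function names, the nested central differences, and the two test functions ($S = r^2$, whose Laplacian is 6, and $S = r\cos\theta = z$, which is harmonic) are illustrative choices, not taken from the text.

```python
import math


def scale_factors_spherical(x):
    """(h_1, h_2, h_3) for spherical coordinates x = (r, theta, phi)."""
    r, theta, _phi = x
    return (1.0, r, r * math.sin(theta))


def laplacian(S, scale_factors, x, h=1e-4):
    """(h1 h2 h3)^{-1} sum_i d/dx^i [ (h1 h2 h3 / h_i^2) dS/dx^i ]."""

    def flux(i, y):
        # (h1 h2 h3 / h_i^2) * dS/dx^i evaluated at the point y
        hs = scale_factors(y)
        yp, ym = list(y), list(y)
        yp[i] += h
        ym[i] -= h
        dS = (S(yp) - S(ym)) / (2.0 * h)
        return hs[0] * hs[1] * hs[2] / hs[i] ** 2 * dS

    total = 0.0
    for i in range(3):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        total += (flux(i, xp) - flux(i, xm)) / (2.0 * h)
    hs = scale_factors(x)
    return total / (hs[0] * hs[1] * hs[2])


point = (1.5, 0.8, 0.3)
lap_r_squared = laplacian(lambda x: x[0] ** 2,
                          scale_factors_spherical, point)
lap_z = laplacian(lambda x: x[0] * math.cos(x[1]),
                  scale_factors_spherical, point)
```

Replacing `scale_factors_spherical` by a function returning $(1, \rho, 1)$ would give the cylindrical case in the same way.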


9 Covariant Differentiation Along a Curve

This chapter has dealt until now with tensor fields defined over all space-time;
we now consider tensors $T(\tau)$ defined only over a curve $x^\mu(\tau)$. Obvious examples
come to mind, such as the momentum $P^\mu(\tau)$ or spin $S_\mu(\tau)$ of a single particle. For
such tensors it would of course be meaningless to talk about covariant differentiation
with respect to $x^\mu$, but we can define a covariant derivative with respect to the
invariant $\tau$ that parameterizes the curve.

Consider first a contravariant vector $A^\mu(\tau)$, with transformation rule

$$
A'^\mu(\tau) = \frac{\partial x'^\mu}{\partial x^\nu}\,A^\nu(\tau)
\tag{4.9.1}
$$

It should be noted that the partial derivative $\partial x'^\mu/\partial x^\nu$ is to be evaluated at
$x^\nu = x^\nu(\tau)$, so that it depends on $\tau$. Therefore, differentiating with respect to $\tau$, we find
two terms,

$$
\frac{dA'^\mu(\tau)}{d\tau}
= \frac{\partial x'^\mu}{\partial x^\nu}\,\frac{dA^\nu(\tau)}{d\tau}
+ \frac{\partial^2 x'^\mu}{\partial x^\nu\,\partial x^\lambda}\,\frac{dx^\lambda}{d\tau}\,A^\nu(\tau)
\tag{4.9.2}
$$

The second derivative $\partial^2 x'^\mu/\partial x^\nu\,\partial x^\lambda$ is the same as that responsible for the
inhomogeneous term in the transformation formula (4.5.8) for the affine connection, so we
are led to define a covariant derivative along the curve $x^\mu(\tau)$ by

$$
\frac{DA^\mu}{D\tau} \equiv \frac{dA^\mu}{d\tau} + \Gamma^\mu_{\nu\lambda}\,\frac{dx^\lambda}{d\tau}\,A^\nu
\tag{4.9.3}
$$

Eqs. (4.5.8), (4.9.1), and (4.9.2) then show that this is a vector:

$$
\frac{DA'^\mu}{D\tau} = \frac{\partial x'^\mu}{\partial x^\nu}\,\frac{DA^\nu}{D\tau}
\tag{4.9.4}
$$

The similarity between (4.9.3) and the formula (4.6.4) for the covariant derivative
of a vector field is evident.

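The same considerations lead us to define the covariant derivative along a
curve $x^\mu(\tau)$ of a covariant vector $B_\mu(\tau)$ by

$$
\frac{DB_\mu}{D\tau} \equiv \frac{dB_\mu}{d\tau} - \Gamma^\lambda_{\mu\nu}\,\frac{dx^\nu}{d\tau}\,B_\lambda
\tag{4.9.5}
$$

and with (4.5.2) we can easily show that this is a vector:

$$
\frac{DB'_\mu}{D\tau} = \frac{\partial x^\nu}{\partial x'^\mu}\,\frac{DB_\nu}{D\tau}
\tag{4.9.6}
$$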


In the same way, the covariant derivative along a curve $x^\mu(\tau)$ of a general tensor
$T(\tau)$ is defined by adding to $dT/d\tau$ a term such as that in (4.9.3) for each upper
index, and subtracting a term such as that in (4.9.5) for each lower index. For
instance,

$$
\frac{DT^\mu{}_\nu}{D\tau}
\equiv \frac{dT^\mu{}_\nu}{d\tau}
+ \Gamma^\mu_{\lambda\rho}\,\frac{dx^\lambda}{d\tau}\,T^\rho{}_\nu
- \Gamma^\sigma_{\lambda\nu}\,\frac{dx^\lambda}{d\tau}\,T^\mu{}_\sigma
\tag{4.9.7}
$$

and

$$
\frac{DT'^\mu{}_\nu}{D\tau}
= \frac{\partial x'^\mu}{\partial x^\rho}\,\frac{\partial x^\sigma}{\partial x'^\nu}\,
  \frac{DT^\rho{}_\sigma}{D\tau}
\tag{4.9.8}
$$

The properties of covariant differentiation outlined in Sections 6 through 8 can be
easily extended to covariant derivatives along a curve.

It should be mentioned that the covariant derivative of a tensor field along a
curve may be determined from its ordinary covariant derivative; for instance, if
$T^\mu{}_\nu$ is a tensor field then (4.9.7) gives

$$
\frac{DT^\mu{}_\nu}{D\tau} = T^\mu{}_{\nu;\lambda}\,\frac{dx^\lambda}{d\tau}
\tag{4.9.9}
$$

However, we shall see in Chapter 6 that tensors defined along curves cannot always
be promoted to tensor fields, and for these the derivative $D/D\tau$ is the only
covariant derivative available.

It is often the case that a vector $A^\mu(\tau)$ carried along a curve by a particle does
not change at $\tau$ if viewed from a reference frame that is locally inertial at $x(\tau)$.
(This is true for a particle's momentum and spin if it is subject to purely gravitational
forces; see Section 5.1.) In this frame the affine connection as well as
$dA^\mu/d\tau$ vanishes, so

$$
\frac{DA^\mu}{D\tau} = 0
\tag{4.9.10}
$$

This being a covariant statement, and true at $x(\tau)$ in the locally inertial system
$\xi_{x(\tau)}$, it is therefore true in all coordinate systems. The vector $A^\mu$ is then subject to
the first-order differential equations

$$
\frac{dA^\mu}{d\tau} = -\Gamma^\mu_{\nu\lambda}\,\frac{dx^\lambda}{d\tau}\,A^\nu
\tag{4.9.11}
$$

that define $A^\mu$ for all $\tau$, given $A^\mu$ at some initial $\tau$. A vector $A^\mu(\tau)$ defined in this
way along a curve $x^\mu(\tau)$ is said to be defined by parallel transport. Any tensor can
be defined along a curve by parallel transport by requiring its covariant derivative
along the curve to vanish.
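
A short numerical experiment makes the parallel-transport equations (4.9.11) concrete. The sketch below, which is an illustration under our own assumptions rather than an example from the text, transports a vector once around the circle of colatitude $\theta_0$ on the unit sphere, parameterized by $\varphi = \tau$. With $\Gamma^\theta_{\varphi\varphi} = -\sin\theta\cos\theta$ and $\Gamma^\varphi_{\theta\varphi} = \cot\theta$, the orthonormal components $(A^\theta, \sin\theta_0\,A^\varphi)$ rotate through the angle $2\pi\cos\theta_0$, so for $\theta_0 = \pi/3$ the vector should return reversed.

```python
import math


def make_rhs(theta0):
    """Right-hand side of (4.9.11) on the circle theta = theta0, phi = tau."""
    sincos = math.sin(theta0) * math.cos(theta0)
    cot = math.cos(theta0) / math.sin(theta0)

    def rhs(a):
        a_theta, a_phi = a
        return (sincos * a_phi,    # -Gamma^theta_{phi phi} A^phi
                -cot * a_theta)    # -Gamma^phi_{theta phi} A^theta

    return rhs


def rk4_step(f, y, h):
    """One classical fourth-order Runge-Kutta step of size h."""
    k1 = f(y)
    k2 = f(tuple(s + 0.5 * h * k for s, k in zip(y, k1)))
    k3 = f(tuple(s + 0.5 * h * k for s, k in zip(y, k2)))
    k4 = f(tuple(s + h * k for s, k in zip(y, k3)))
    return tuple(s + h * (a + 2.0 * b + 2.0 * c + d) / 6.0
                 for s, a, b, c, d in zip(y, k1, k2, k3, k4))


def transport_around_latitude(theta0, a_start, steps=2000):
    """Parallel-transport (A^theta, A^phi) from phi = 0 to phi = 2 pi."""
    rhs = make_rhs(theta0)
    h = 2.0 * math.pi / steps
    a = tuple(a_start)
    for _ in range(steps):
        a = rk4_step(rhs, a, h)
    return a


theta0 = math.pi / 3.0
a_theta, a_phi = transport_around_latitude(theta0, (1.0, 0.0))
# Invariant length squared g_{mu nu} A^mu A^nu at the end of the circuit.
norm_sq = a_theta ** 2 + math.sin(theta0) ** 2 * a_phi ** 2
```

The length of the vector is unchanged by the transport, as it must be since the covariant derivative of the metric vanishes, while its direction depends on the path taken.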


10 The Electromagnetic Analogy* 

* This section lies somewhat out of the book's main line of development, and may be omitted in a first
reading.

I emphasized in Section 1 of this chapter that general covariance is not an
ordinary symmetry principle like Lorentz invariance, but is rather a dynamical
principle that governs the effect of gravitational fields. As such, it bears a strong
resemblance to another "dynamic symmetry," local gauge invariance, which
governs the effects of electromagnetic fields. Local gauge invariance says that the
differential equations satisfied by a set of charged fields $\psi(x)$ and the electromagnetic
potential $A_\alpha(x)$ retain the same form when these fields are subjected to
the transformations$^6$

$$
\psi(x) \to \psi(x)\,e^{ie\varphi(x)}
\tag{4.10.1}
$$

$$
A_\alpha(x) \to A_\alpha(x) + \frac{\partial \varphi(x)}{\partial x^\alpha}
\tag{4.10.2}
$$

where $e$ is the charge of the particle represented by $\psi$ and $\varphi(x)$ is an arbitrary
function of the space-time coordinates $x^\alpha$. How are we to construct gauge-invariant
equations? Note that derivatives of a charged field $\psi$ do not behave under gauge
transformations like $\psi$, but rather

$$
\frac{\partial \psi(x)}{\partial x^\alpha}
\to \frac{\partial}{\partial x^\alpha}\left[\psi(x)\,e^{ie\varphi(x)}\right]
= e^{ie\varphi(x)}\left[\frac{\partial \psi(x)}{\partial x^\alpha}
  + ie\,\psi(x)\,\frac{\partial \varphi(x)}{\partial x^\alpha}\right]
$$

just as derivatives of tensors do not behave like tensors under general coordinate
transformations. It follows that an equation such as

$$
(\Box^2 - m^2)\,\psi(x) = 0
\qquad \text{where} \qquad
\Box^2 \equiv \eta^{\alpha\beta}\,\frac{\partial^2}{\partial x^\alpha\,\partial x^\beta}
$$

is not gauge-invariant, just as it is not generally covariant. Also note that the
electromagnetic potential $A_\alpha(x)$ obeys an inhomogeneous gauge transformation law,
just as the affine connection obeys the inhomogeneous transformation law (4.5.2)
for general coordinate transformations. In tensor analysis we put together derivatives
of tensors and the affine connection to form "covariant derivatives" that
transform like tensors. In electrodynamics we put together derivatives of fields
and the vector potential to form "gauge-covariant derivatives"

$$
\mathcal{D}_\alpha \psi(x) \equiv \left[\frac{\partial}{\partial x^\alpha} - ie\,A_\alpha(x)\right]\psi(x)
\tag{4.10.3}
$$

that transform like the fields themselves,

$$
\mathcal{D}_\alpha \psi(x) \to \left[\mathcal{D}_\alpha \psi(x)\right]e^{ie\varphi(x)}
\tag{4.10.4}
$$
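
The transformation rule (4.10.4) can be verified numerically in a one-dimensional toy model. In the sketch below the field `psi`, the potential `A`, the gauge function `phi`, and the value of the charge `E` are all arbitrary illustrative choices; the derivative is taken by central differences, and the gauge-covariant derivative is built as in (4.10.3) with the sign convention $\partial/\partial x - ieA$ implied by (4.10.1) and (4.10.2).

```python
import cmath
import math

E = 0.3  # charge e (illustrative value)


def psi(x):
    """An arbitrary complex field (illustrative)."""
    return cmath.exp(1j * 2.0 * x) * (1.0 + 0.5 * x)


def A(x):
    """An arbitrary potential (illustrative)."""
    return math.sin(x)


def phi(x):
    """An arbitrary gauge function (illustrative)."""
    return x * x


def deriv(f, x, h=1e-5):
    """Central-difference derivative of a real or complex function."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


def gauge_covariant_derivative(field, potential, x):
    """D psi = d psi/dx - i e A psi, as in Eq. (4.10.3), in one dimension."""
    return deriv(field, x) - 1j * E * potential(x) * field(x)


def psi_prime(x):
    """Gauge-transformed field, Eq. (4.10.1)."""
    return psi(x) * cmath.exp(1j * E * phi(x))


def A_prime(x):
    """Gauge-transformed potential, Eq. (4.10.2)."""
    return A(x) + deriv(phi, x)


x0 = 0.7
lhs = gauge_covariant_derivative(psi_prime, A_prime, x0)
rhs = gauge_covariant_derivative(psi, A, x0) * cmath.exp(1j * E * phi(x0))
```

The two complex numbers `lhs` and `rhs` agree to finite-difference accuracy, whereas the ordinary derivative of `psi_prime` differs from the phase-rotated derivative of `psi` by the inhomogeneous term.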


An equation that is invariant under gauge transformations with $\varphi$ constant (such
invariance is simply tantamount to charge conservation) will be invariant under the
general gauge transformation (4.10.1)-(4.10.2) provided that it is constructed only
out of fields $\psi(x)$ and their gauge-covariant derivatives $\mathcal{D}_\alpha\psi(x)$, just as an equation
that is invariant under Lorentz transformations will be invariant under general
coordinate transformations provided that it is constructed out of tensors and
their covariant derivatives. For instance, we can write a gauge-invariant equation
that might represent the effect of electromagnetism on a charged scalar field $\psi(x)$ as

$$
\left[\eta^{\alpha\beta}\,\mathcal{D}_\alpha \mathcal{D}_\beta - m^2\right]\psi(x) = 0
\tag{4.10.5}
$$

or, in more detail,

$$
\left(\Box^2 - 2ie\,A^\alpha\,\frac{\partial}{\partial x^\alpha}
- ie\,\frac{\partial A^\alpha}{\partial x^\alpha}
- e^2 A^\alpha A_\alpha - m^2\right)\psi(x) = 0
$$

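One important property of such theories is that they admit the construction of a
conserved gauge-invariant current; in this example we can define

$$
J_\alpha(x) \equiv -ie\left\{\psi^\dagger(x)\,\mathcal{D}_\alpha\psi(x)
- \psi(x)\left[\mathcal{D}_\alpha\psi(x)\right]^\dagger\right\}
$$

(A dagger denotes complex conjugation, or in quantum theories the Hermitian
adjoint.) That this is gauge-invariant is obvious; to see that it is conserved, we
write

$$
\begin{aligned}
\frac{i}{e}\,\frac{\partial}{\partial x^\alpha}J^\alpha(x)
&= \frac{\partial \psi^\dagger(x)}{\partial x^\alpha}\,\mathcal{D}^\alpha\psi(x)
 - \frac{\partial \psi(x)}{\partial x^\alpha}\left[\mathcal{D}^\alpha\psi(x)\right]^\dagger
 + \psi^\dagger(x)\,\frac{\partial}{\partial x^\alpha}\,\mathcal{D}^\alpha\psi(x)
 - \psi(x)\left[\frac{\partial}{\partial x^\alpha}\,\mathcal{D}^\alpha\psi(x)\right]^\dagger \\
&= \psi^\dagger(x)\left(\frac{\partial}{\partial x^\alpha} - ie\,A_\alpha(x)\right)\mathcal{D}^\alpha\psi(x)
 - \psi(x)\left[\left(\frac{\partial}{\partial x^\alpha} - ie\,A_\alpha(x)\right)\mathcal{D}^\alpha\psi(x)\right]^\dagger \\
&= \psi^\dagger(x)\,\mathcal{D}_\alpha\mathcal{D}^\alpha\psi(x)
 - \psi(x)\left[\mathcal{D}_\alpha\mathcal{D}^\alpha\psi(x)\right]^\dagger
\end{aligned}
$$

and using (4.10.5) this gives

$$
\frac{\partial}{\partial x^\alpha}J^\alpha(x) = 0
$$

We can thus use this current in the right-hand side of Maxwell's equations (2.7.6),
and these equations will then be gauge-invariant also. We see in Chapter 7 that the
field equations for gravitation are constructed in an analogous manner.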

The analogy between the gauge invariance of electrodynamics and the general 
covariance of general relativity can be extended to a similar dynamic symmetry, 
called chirality, 3 that governs the interactions of pi-mesons. A proper explanation 
of this point would fill another book. 


11 p-Forms and Exterior Derivatives*

* This section lies somewhat out of the book's main line of development, and may be omitted in a first
reading.

Antisymmetric tensors and their antisymmetrized derivatives possess certain
remarkably simple and useful properties, some of which we have already encountered
in Section 4.7. In order to deal with these properties in a unified way,
mathematicians have developed a general formalism, known as the theory of
differential forms.$^1$ Unfortunately, the rather abstract and compact notation
associated with this formalism has in recent years seriously impeded communication
between pure mathematicians and physicists. This section presents some of the
fundamental results of the theory of differential forms, but in the tensor notation
familiar to physicists, rather than the recondite notation favored by mathematicians.

A covariant tensor of rank $p$, which is antisymmetric under exchange of any
pair of indices, will be called a p-form. In $n$ dimensions, the number of algebraically
independent components of a p-form is just the binomial coefficient

$$
\binom{n}{p} = \frac{n!}{p!\,(n - p)!}
\tag{4.11.1}
$$

For instance, a scalar field is a 0-form, a covariant vector field is a 1-form, and an
antisymmetric covariant tensor with two indices is a 2-form.

Linear combinations of p-forms are p-forms. However, the direct product
$s_{\mu\nu\cdots}\,t_{\rho\sigma\cdots}$ of a p-form $s_{\mu\nu\cdots}$ and a q-form $t_{\rho\sigma\cdots}$ is not a $(p + q)$-form, because
it is not completely antisymmetric. We can form a $(p + q)$-form $s \wedge t$ by
antisymmetrizing the direct product:

$$
(s \wedge t)_{\mu_1 \cdots \mu_{p+q}}
\equiv \mathrm{Antisym}\left\{s_{\mu_1 \cdots \mu_p}\,t_{\mu_{p+1} \cdots \mu_{p+q}}\right\}
\tag{4.11.2}
$$

where, in general, "Antisym" denotes an average over all permutations $\Pi$ of the
indices,

$$
\mathrm{Antisym}\left\{u_{\mu_1 \mu_2 \cdots \mu_m}\right\}
\equiv \frac{1}{m!}\sum_\Pi \delta_\Pi\,u_{\Pi\mu_1\,\Pi\mu_2 \cdots \Pi\mu_m}
\tag{4.11.3}
$$

with a sign-factor $\delta_\Pi$ which is $+1$ or $-1$ according to whether $\Pi$ consists of an even
or odd number of permutations of individual index pairs:

$$
\delta_\Pi \equiv
\begin{cases}
+1 & \Pi \text{ even} \\
-1 & \Pi \text{ odd}
\end{cases}
\tag{4.11.4}
$$


The antisymmetrized direct product (4.11.2) is known as the exterior product.
For instance, the exterior product of a 0-form $s$ and a 1-form $t_\mu$ is simply the
ordinary product

$$
(s \wedge t)_\mu = s\,t_\mu
$$

whereas the exterior product of a 1-form $s_\mu$ and a 1-form $t_\nu$ is the 2-form

$$
(s \wedge t)_{\mu\nu} = \tfrac{1}{2}\left(s_\mu t_\nu - s_\nu t_\mu\right)
$$

The reader can easily verify that the exterior product is associative,

$$
(s \wedge t) \wedge u = s \wedge (t \wedge u)
\tag{4.11.5}
$$

and bilinear,

$$
\begin{aligned}
(\alpha_1 s_1 + \alpha_2 s_2) \wedge t &= \alpha_1 (s_1 \wedge t) + \alpha_2 (s_2 \wedge t) \\
s \wedge (\alpha_1 t_1 + \alpha_2 t_2) &= \alpha_1 (s \wedge t_1) + \alpha_2 (s \wedge t_2)
\end{aligned}
\tag{4.11.6}
$$

(Here $\alpha_1$ and $\alpha_2$ are scalars.) However, the exterior product is not commutative;
rather, if $s$ is a p-form and $t$ is a q-form, then

$$
(s \wedge t) = (-1)^{pq}\,(t \wedge s)
\tag{4.11.7}
$$
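
The definitions (4.11.2)-(4.11.4) translate almost word for word into a short program, which can then be used to check the stated properties on examples. In the sketch below a tensor of rank $m$ is stored as a dictionary keyed by index tuples; this representation, the function names, and the sample 1-forms are illustrative assumptions.

```python
from itertools import permutations, product
from math import factorial


def permutation_sign(perm):
    """+1 for an even permutation of (0, ..., m-1), -1 for an odd one."""
    perm = list(perm)
    sign = 1
    for i in range(len(perm)):
        while perm[i] != i:
            j = perm[i]
            perm[i], perm[j] = perm[j], perm[i]  # each swap flips the sign
            sign = -sign
    return sign


def antisym(u, rank, dim):
    """Eq. (4.11.3): signed average over all permutations of the indices."""
    out = {}
    for idx in product(range(dim), repeat=rank):
        total = 0.0
        for perm in permutations(range(rank)):
            permuted = tuple(idx[k] for k in perm)
            total += permutation_sign(perm) * u[permuted]
        out[idx] = total / factorial(rank)
    return out


def direct_product(s, p, t, q, dim):
    """Components s_{...} t_{...} of the direct product, rank p + q."""
    return {i + j: s[i] * t[j]
            for i in product(range(dim), repeat=p)
            for j in product(range(dim), repeat=q)}


def wedge(s, p, t, q, dim):
    """Exterior product, Eq. (4.11.2)."""
    return antisym(direct_product(s, p, t, q, dim), p + q, dim)


# Three sample 1-forms in three dimensions.
s = {(0,): 1.0, (1,): 2.0, (2,): 3.0}
t = {(0,): 4.0, (1,): -1.0, (2,): 0.5}
u = {(0,): -2.0, (1,): 0.25, (2,): 1.5}

st = wedge(s, 1, t, 1, 3)
ts = wedge(t, 1, s, 1, 3)
left = wedge(st, 2, u, 1, 3)                     # (s ^ t) ^ u
right = wedge(s, 1, wedge(t, 1, u, 1, 3), 2, 3)  # s ^ (t ^ u)
```

For these 1-forms `st[(0, 1)]` reproduces $\tfrac{1}{2}(s_0 t_1 - s_1 t_0)$, `ts` is the negative of `st` as required by (4.11.7) with $p = q = 1$, and `left` and `right` coincide, illustrating (4.11.5).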

This is a good spot to pause, and mention that in books on the mathematical
theory of differential forms,$^7$ a p-form $t$ is generally not represented by the tensor
components $t_{\mu\nu\cdots}$, but rather by the "differential form"

$$
\omega \equiv t_{\mu\nu\cdots}\,(dx^\mu \wedge dx^\nu \wedge \cdots)
$$

The symbol $dx^\mu$ here denotes a quantity that transforms like a coordinate
differential, that is, as a contravariant vector, but whose products, unlike ordinary
products of coordinate differentials, are associative and anticommutative:

$$
\begin{aligned}
(dx^\mu \wedge dx^\nu) \wedge dx^\lambda &= dx^\mu \wedge (dx^\nu \wedge dx^\lambda) \\
dx^\mu \wedge dx^\nu &= -dx^\nu \wedge dx^\mu
\end{aligned}
$$

The product $\omega_1 \wedge \omega_2$ of differential forms $\omega_1$ and $\omega_2$ has tensor coefficients
$t_{\mu\nu\cdots}$ that are just given by the exterior product of the tensor coefficients of $\omega_1$
and $\omega_2$. The associativity and commutativity rules (4.11.5) and (4.11.7) of the
exterior product then follow trivially from the associative and anticommutative
properties of the product $dx^\mu \wedge dx^\nu$. As already indicated, we shall not use this
language here; for us, a p-form will be simply an antisymmetrical tensor, not the
corresponding differential form.

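The point of developing the theory of p-forms separately from the rest of
tensor analysis emerges when we study their derivatives. The partial derivative
operator $\partial/\partial x^\mu$ is a covariant vector, or in other words a 1-form, so given any
p-form $t$ we can define a $(p + 1)$-form $Dt$, known as the exterior derivative of $t$, by
simply taking the exterior product of $\partial/\partial x$ with $t$:

$$
Dt \equiv \frac{\partial}{\partial x} \wedge t
\tag{4.11.8}
$$

or, in more detail,

$$
(Dt)_{\mu_1 \cdots \mu_{p+1}}
\equiv \mathrm{Antisym}\left\{\frac{\partial}{\partial x^{\mu_1}}\,t_{\mu_2 \cdots \mu_{p+1}}\right\}
\tag{4.11.9}
$$

For instance, the exterior derivative of a 0-form $t$ is simply the ordinary gradient

$$
(Dt)_\mu = \frac{\partial t}{\partial x^\mu}
$$

whereas the exterior derivative of a 1-form $t_\mu$ is just the "curl"

$$
(Dt)_{\mu\nu} = \frac{1}{2}\left(\frac{\partial t_\nu}{\partial x^\mu} - \frac{\partial t_\mu}{\partial x^\nu}\right)
$$

In three dimensions, the exterior derivative of a 2-form $t_{ij}$ may be expressed as an
ordinary divergence:

$$
(Dt)_{ijk} = \frac{1}{3}\,\epsilon_{ijk}
\left(\frac{\partial t_{23}}{\partial x^1} + \frac{\partial t_{31}}{\partial x^2} + \frac{\partial t_{12}}{\partial x^3}\right)
$$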

The first remarkable property of the exterior derivative is that, acting on a
tensor p-form, it gives a tensor $(p + 1)$-form. The easiest way to see this is just to
note that the partial derivatives used to define the exterior derivative can be
replaced with covariant derivatives,

$$
(Dt)_{\mu_1 \cdots \mu_{p+1}} = \mathrm{Antisym}\left\{t_{\mu_2 \cdots \mu_{p+1};\mu_1}\right\}
\tag{4.11.10}
$$

because the contributions of the affine connections that appear in the covariant
derivatives vanish upon antisymmetrization. Our previously derived results
(4.7.1), (4.7.2), and (4.7.11) are special cases of Eq. (4.11.10) for $p = 0$, $p = 1$, and
$p = 2$.

From the associativity and commutativity rules (4.11.5) and (4.11.7) for the
exterior product, we can easily derive a simple formula for the exterior derivative
of the exterior product of a p-form $s$ and a q-form $t$:

$$
\begin{aligned}
D(s \wedge t) &= Ds \wedge t + (-1)^{pq}\,Dt \wedge s \\
&= Ds \wedge t + (-1)^{p}\,s \wedge Dt
\end{aligned}
\tag{4.11.11}
$$

From the same rules, it also follows that repeated exterior derivatives vanish,

$$
D^2 t = \frac{\partial}{\partial x} \wedge \left(\frac{\partial}{\partial x} \wedge t\right)
= \left(\frac{\partial}{\partial x} \wedge \frac{\partial}{\partial x}\right) \wedge t = 0
\tag{4.11.12}
$$

This latter result is known as Poincaré's lemma. Among the special cases of this
lemma are two well-known results of three-dimensional vector analysis, that a
gradient has zero curl and a curl has zero divergence.

The question naturally arises whether the converse to Poincaré's lemma is also
valid. That is, if $s$ is a $(p + 1)$-form for which

$$
Ds = 0
\tag{4.11.13}
$$

then can we express $s$ as

$$
s = Dt
\tag{4.11.14}
$$

for some p-form $t$? The answer is yes, provided that the region $\mathcal{R}$ over which
(4.11.13) holds, and over which we want (4.11.14) to hold, can be deformed to a
point. In general, we say that a region $\mathcal{R}$ can be deformed to a point $y^\mu$ if every
point $x^\mu$ in $\mathcal{R}$ can be connected with the point $y^\mu$ by a path $X^\mu(\lambda; x)$ lying entirely
in $\mathcal{R}$; here $\lambda$ is a real parameter that can be taken to run from 0 to 1, with

$$
X^\mu(0; x) = y^\mu \qquad X^\mu(1; x) = x^\mu
$$

It is straightforward to verify that if (4.11.13) holds over such a region then
(4.11.14) will be satisfied throughout this region by the p-form

$$
t_{\mu_1 \cdots \mu_p}(x) = (p + 1)\int_0^1
\frac{\partial X^\nu(\lambda; x)}{\partial \lambda}\,
\frac{\partial X^{\nu_1}(\lambda; x)}{\partial x^{\mu_1}} \cdots
\frac{\partial X^{\nu_p}(\lambda; x)}{\partial x^{\mu_p}}\;
s_{\nu\nu_1 \cdots \nu_p}\big(X(\lambda; x)\big)\,d\lambda
\tag{4.11.15}
$$

The well-known results of three-dimensional vector analysis, that a vector may be
expressed as a gradient if it has zero curl, or as a curl if it has zero divergence, may
be regarded as special cases of this theorem for $p = 0$ and $p = 1$, respectively.
Maxwell's equations provide an example in four dimensions: The field-strength
tensor $F_{\alpha\beta}$ is a 2-form that according to Eq. (2.7.10) has vanishing exterior derivative,
so that it can be expressed as the exterior derivative of a 1-form, conventionally
denoted $2A_\alpha$,

$$
F_{\alpha\beta} = \frac{\partial A_\beta}{\partial x^\alpha} - \frac{\partial A_\alpha}{\partial x^\beta}
$$

as in Eq. (2.7.11). In general, the p-form $t$ satisfying Eq. (4.11.14) is not unique;
given one such $t$, the most general p-form satisfying (4.11.14) is of the form

$$
t' = t + Du
\tag{4.11.16}
$$

where $u$ is an arbitrary $(p - 1)$-form. For instance, if $A_\alpha$ is one vector potential
whose curl is $F_{\alpha\beta}$, then the most general such vector potential is given by the
"gauge transformation"

$$
A'_\alpha = A_\alpha + \frac{\partial \Phi}{\partial x^\alpha}
$$

where $\Phi$ is an arbitrary 0-form, that is, an arbitrary scalar.
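
The construction (4.11.15) can be tried out in its simplest case, $p = 0$ in two dimensions, where it states that a curl-free 1-form $s_\mu$ is the gradient of $t(x) = \int_0^1 (\partial X^\nu/\partial\lambda)\,s_\nu(X)\,d\lambda$. The sketch below assumes the straight path $X(\lambda; x) = \lambda x$ from the origin, a sample 1-form that is the gradient of $x_0^2 x_1 + \sin x_1$, and Simpson's rule for the integral; all of these are illustrative choices rather than anything prescribed by the text.

```python
import math


def s(x):
    """A curl-free 1-form: the gradient of x0^2 x1 + sin(x1)."""
    return (2.0 * x[0] * x[1], x[0] ** 2 + math.cos(x[1]))


def potential(x, n=200):
    """Eq. (4.11.15) with p = 0 and the path X(lambda; x) = lambda * x.

    Then dX^nu/dlambda = x^nu; the integral over lambda is evaluated with
    the composite Simpson rule on n (even) subintervals.
    """
    def integrand(lam):
        sx = s((lam * x[0], lam * x[1]))
        return x[0] * sx[0] + x[1] * sx[1]

    h = 1.0 / n
    total = integrand(0.0) + integrand(1.0)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * integrand(k * h)
    return total * h / 3.0


def gradient(f, x, h=1e-5):
    """Central-difference gradient of a function of two variables."""
    return tuple(
        (f((x[0] + h * (i == 0), x[1] + h * (i == 1)))
         - f((x[0] - h * (i == 0), x[1] - h * (i == 1)))) / (2.0 * h)
        for i in range(2))


point = (1.2, 0.7)
t_value = potential(point)
grad_t = gradient(potential, point)
s_value = s(point)
```

Here `t_value` reproduces $x_0^2 x_1 + \sin x_1$ at the chosen point (the potential that vanishes at the origin), and `grad_t` agrees with `s_value`; adding a constant to `potential` would give an equally good $t$, which is the $p = 0$ instance of the freedom (4.11.16).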

Just as the exterior derivative provides a natural generalization of the
familiar gradient, curl, and divergence, so also it is possible to construct a scalar
integral of p-forms over manifolds of dimensionality $p$, which provides a natural
generalization of the familiar volume integrals of scalar densities and surface
integrals of normal components of vector densities. A manifold $\mathcal{M}$ of dimensionality
$p$ in an $n$-dimensional space is simply a region within which the $n$ coordinates $x^\mu$
may be expressed in a smooth one-to-one way as functions of $p$ parameters $u^i$:

$$
x^\mu = x^\mu(u^1, u^2, \ldots, u^p)
\tag{4.11.17}
$$

Actually, it is often impossible to cover the whole of a manifold with a single set
of u-coordinates; in the general case, it is necessary to introduce different sets of
u-coordinates in different overlapping patches of the manifold, with the proviso that
in the overlap between one patch with coordinates $u^i$ and another patch with
coordinates $\bar{u}^i$, the $u^i$ can be expressed in a smooth one-to-one way as functions of
the $\bar{u}^i$, and vice versa. We shall actually be concerned here with what are called
orientable manifolds, for which the coordinates in each patch can be chosen so that
in the overlap regions all determinants $|\partial u/\partial \bar{u}|$ are positive-definite. For instance,
the surface of a sphere is orientable. For simplicity of notation, the discussion
below does not take account of these complications, but it should be kept in mind
that more than one set of u-coordinates may be needed to cover the manifold.
With this understanding, the integral of a p-form $t$ over a manifold $\mathcal{M}$ of
dimensionality $p$ is defined as the repeated integral

$$
\int_{\mathcal{M}} t\,dV_p \equiv \int t_{\mu_1 \cdots \mu_p}\,
\frac{\partial x^{\mu_1}}{\partial u^1} \cdots \frac{\partial x^{\mu_p}}{\partial u^p}\;
du^1 \cdots du^p
\tag{4.11.18}
$$

with limits of integration set by the boundaries of the manifold.

This integral is obviously a scalar with respect to transformations of the $x^\mu$
coordinates used to define the p-form. It is also necessary to consider how the
integral behaves if we decide to describe the manifold with a new set of parameters
$\bar{u}^1 \cdots \bar{u}^p$ instead of $u^1 \cdots u^p$. Taking account of the antisymmetry of $t$, it is easy to
see that in this case the integrand changes by a factor, the determinant $|\partial u/\partial \bar{u}|$,
whereas the p-dimensional volume element changes by a positive factor $\big|\,|\partial \bar{u}/\partial u|\,\big|$.
Thus the whole integral either is unchanged or changes by a minus sign, according
to whether the determinant $|\partial u/\partial \bar{u}|$ is positive or negative. (We tacitly assume that
the transformation $u^i \to \bar{u}^i$ is nonsingular, so that this determinant cannot vanish,
and therefore keeps the same sign throughout $\mathcal{M}$.) This result shows, incidentally,
that when several u-coordinate systems are needed to cover the manifold, the
integral of a p-form over the overlap between two patches described by coordinates
$u^i$ and $\bar{u}^i$ can be evaluated using either coordinate system, provided that the
determinant $|\partial u/\partial \bar{u}|$ is positive; it is for this reason that we have to restrict our
attention to manifolds that are orientable.

The simplest example of an integral of the form (4.11.18) is provided by the
special case where p is equal to the dimensionality n of the $x^\mu$ coordinate space.
Here the $x^\mu$ coordinates themselves may be used as the u-coordinates, so that
(4.11.18) becomes in this case

$$ \int_{\mathcal{M}} t\, dV_n = \int t_{12 \cdots n}\, dx^1 \cdots dx^n $$

Note that the integrand $t_{12 \cdots n}$, in addition to being one component of a tensor, is
also a scalar density of weight $-1$, because we can write it as

$$ t_{12 \cdots n} = \frac{1}{n!}\, \epsilon^{\mu_1 \cdots \mu_n}\, t_{\mu_1 \cdots \mu_n} $$

and $\epsilon^{\mu\nu\cdots}$ is a tensor density of weight $-1$. (See Section 4.4.) In the next simplest
case, where $p = n - 1$, we can write (4.11.18) in the familiar form


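$$ \int_{\mathcal{M}} t\, dV_p = \int t^\mu\, dS_\mu $$

where $t^\mu$ is a vector density, defined by

$$ t^\mu \equiv \frac{1}{p!}\, \epsilon^{\mu \mu_1 \cdots \mu_p}\, t_{\mu_1 \cdots \mu_p} $$

and $dS_\mu$ is a surface element oriented normal to the manifold:

$$ dS_\mu \equiv \epsilon_{\mu \mu_1 \cdots \mu_p}\, \frac{\partial x^{\mu_1}}{\partial u^1} \cdots \frac{\partial x^{\mu_p}}{\partial u^p}\, du^1 \cdots du^p $$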

With this general definition of the integrals of p-forms, it is possible to prove
that the integral of the exterior derivative of a p-form over a manifold of
dimensionality (p + 1) is simply the integral of the p-form itself over the p-dimensional
boundary of the manifold:$^7$

$$ \int_{\mathcal{M}} Dt\, dV_{p+1} = \int_{\text{boundary of } \mathcal{M}} t\, dV_p \tag{4.11.19} $$

(We shall not go into the problem of defining the orientation of the boundary,
which is needed in order to specify the sign of the right-hand side.) Stokes’s
theorem and Gauss’s theorem are simply the special cases of this general formula
for n = 3, p = 1 and n = 3, p = 2, respectively.
Some guide the course of wand’ring orbs on high,
Or roll the planets through the boundless sky. 
Some less refined, beneath the moon’s pale light 
Pursue the stars that shoot athwart the night, 

Or suck the mists in grosser air below, 

Or dip their pinions in the painted bow 
Or blow fierce tempests on the wintry main, 

Or o’er the glebe distill the kindly rain. 

Alexander Pope, The Rape of the Lock 


5 EFFECTS OF 
GRAVITATION 


We now return to physics, and apply what we have learned in the last chapter
to determine the effects of gravitation on the equations of mechanics and
electrodynamics. The technique to be used is that afforded by the Principle of General
Covariance: We must first write the equations as they hold in special relativity,
then decide how each quantity in the equations is to transform under general
coordinate transformations, and then replace $\eta_{\mu\nu}$ with $g_{\mu\nu}$ and all derivatives with
covariant derivatives. The resulting equations will be generally covariant and true
in the absence of gravitation, and hence true in arbitrary gravitational fields,
provided that the system in question is small enough compared with the scale of
the fields.


1 Particle Mechanics 

A particle not under the influence of any force will in special relativity have
constant four-velocity $U^\alpha$ and constant spin $S_\alpha$, that is,

$$ \frac{dU^\alpha}{d\tau} = 0 \tag{5.1.1} $$

$$ \frac{dS_\alpha}{d\tau} = 0 \tag{5.1.2} $$

Recall that $S_\alpha$ is defined in the rest frame of the particle to have the components
$\{\mathbf{S}, 0\}$, so that in an arbitrary Lorentz frame it satisfies the further relation

$$ S_\alpha U^\alpha = 0 \tag{5.1.3} $$


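In order to make these equations generally covariant, we define vectors $U^\mu$
and $S_\mu$ in a general coordinate system $x^\mu$ by

$$ U^\mu \equiv \frac{\partial x^\mu}{\partial \xi^\alpha}\, U_f^\alpha = \frac{dx^\mu}{d\tau} \tag{5.1.4} $$

$$ S_\mu \equiv \frac{\partial \xi^\alpha}{\partial x^\mu}\, S_{f\alpha} \tag{5.1.5} $$

where $U_f^\alpha$ and $S_{f\alpha}$ are components of U and S in the freely falling coordinate
system $\xi^\alpha$. Although $U^\mu$ and $S_\mu$ are vectors, $dU^\mu/d\tau$ and $dS_\mu/d\tau$ are not, but we saw
in Section 4.9 that there can be defined vector derivatives $DU^\mu/D\tau$ and $DS_\mu/D\tau$,
which reduce when $\Gamma^\lambda_{\mu\nu} = 0$ to the ordinary derivatives $dU^\mu/d\tau$ and $dS_\mu/d\tau$. The
correct equations for the particle position and spin are dictated by the Principle of
General Covariance to be

$$ \frac{DU^\mu}{D\tau} = 0 \qquad \frac{DS_\mu}{D\tau} = 0 \tag{5.1.6} $$

or, in more detail,

$$ \frac{dU^\mu}{d\tau} + \Gamma^\mu_{\nu\lambda}\, U^\nu U^\lambda = 0 \tag{5.1.7} $$

$$ \frac{dS_\mu}{d\tau} - \Gamma^\lambda_{\mu\nu}\, U^\nu S_\lambda = 0 \tag{5.1.8} $$

In addition, (5.1.3) now becomes

$$ S_\mu U^\mu = 0 \tag{5.1.9} $$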


To repeat the reasoning of Section 4.1, these equations are true in the presence of
gravitational fields because they are generally covariant and true in the absence of
gravitation, reducing when $\Gamma^\lambda_{\mu\nu}$ vanishes to Eqs. (5.1.1)-(5.1.3). That is, the
Principle of Equivalence tells us that there are locally inertial coordinate systems
in which (5.1.6)-(5.1.9) are valid (provided always that our particle is sufficiently
small), and general covariance then ensures that these equations hold in the
laboratory reference frame.

We recognize in Eqs. (5.1.7) and (5.1.8) the differential equations for parallel
transport of the vectors $U^\mu$ and $S_\mu$. Since $U^\mu = dx^\mu/d\tau$, Eq. (5.1.7) is nothing but
the familiar equation for free fall, derived previously by differentiating (5.1.4) with
respect to $\tau$ and using (5.1.1); it should be evident that a good deal of work is
saved by using general covariance instead of our previous direct approach.
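
Equation (5.1.7) can be integrated numerically once the affine connection is specified. As a hedged illustration (not from the original text), the Python sketch below takes the flat plane in polar coordinates $(r, \theta)$, whose only nonvanishing connection components are $\Gamma^r_{\theta\theta} = -r$ and $\Gamma^\theta_{r\theta} = \Gamma^\theta_{\theta r} = 1/r$, and integrates (5.1.7) with a fourth-order Runge-Kutta step. The function names, step count, and initial data are assumptions made for the example. Since there is no gravitational field here, the trajectory should be a straight line in Cartesian coordinates.

```python
import math

def gamma_polar(x):
    """G[m][n][l] = Gamma^m_{nl} for the flat plane, x = (r, theta)."""
    r = x[0]
    G = [[[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]]]
    G[0][1][1] = -r                      # Gamma^r_{theta theta}
    G[1][0][1] = G[1][1][0] = 1.0 / r    # Gamma^theta_{r theta}
    return G

def deriv(state):
    """state = [r, theta, U^r, U^theta]; returns d(state)/d(tau) from (5.1.7)."""
    x, u = state[:2], state[2:]
    G = gamma_polar(x)
    acc = [-sum(G[m][n][l] * u[n] * u[l] for n in range(2) for l in range(2))
           for m in range(2)]
    return u + acc

def rk4(state, tau_end, steps=1000):
    """Classical fourth-order Runge-Kutta integration from tau = 0 to tau_end."""
    h = tau_end / steps
    for _ in range(steps):
        k1 = deriv(state)
        k2 = deriv([s + 0.5 * h * k for s, k in zip(state, k1)])
        k3 = deriv([s + 0.5 * h * k for s, k in zip(state, k2)])
        k4 = deriv([s + h * k for s, k in zip(state, k3)])
        state = [s + h / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4)]
    return state

# Start at Cartesian (1, 0) with Cartesian velocity (0, 1).
r, th, ur, uth = rk4([1.0, 0.0, 0.0, 1.0], 1.0)
x_final, y_final = r * math.cos(th), r * math.sin(th)
```

The connection terms in (5.1.7) are exactly what is needed to keep the path straight: in Cartesian terms the particle simply moves from (1, 0) along the line x = 1.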
Equation (5.1.8) describes the precession of gyroscopes in free fall, and is further
discussed in Chapter 9. For the present, we note that $S_\mu S^\mu$ is constant, because the
ordinary derivative of a scalar is the same as its covariant derivative:

$$ \frac{d}{d\tau}\left(S_\mu S^\mu\right) = \frac{D}{D\tau}\left(S_\mu S^\mu\right) = 0 \tag{5.1.10} $$


If the particle is not in free fall, then $DU^\mu/D\tau$ does not vanish, and instead of
(5.1.7) we have

$$ \frac{DU^\mu}{D\tau} = \frac{f^\mu}{m} \tag{5.1.11} $$

where m is the particle mass and $f^\mu$ is a contravariant force vector. This can also
be written as

$$ m\, \frac{d^2 x^\mu}{d\tau^2} = f^\mu - m\, \Gamma^\mu_{\nu\lambda}\, \frac{dx^\nu}{d\tau} \frac{dx^\lambda}{d\tau} $$

The term containing $m\Gamma^\mu_{\nu\lambda}$ evidently plays the role of a gravitational force. We
can always calculate $f^\mu$ if we know its value $f_f^\alpha$ in freely falling frames, for the
requirement that $f^\mu$ behave as a vector gives it uniquely as

$$ f^\mu = \frac{\partial x^\mu}{\partial \xi^\alpha}\, f_f^\alpha \tag{5.1.12} $$


The electromagnetic force will be determined in the next section. 

It sometimes happens that a particle is acted on by a force $f^\mu$ without
experiencing any torque. In this event, an observer in a locally inertial coordinate frame
that is momentarily at rest with respect to the particle will see no precession of
the spin axis; that is, $d\mathbf{S}/dt$ will vanish. But in this particular coordinate system
$d\mathbf{x}/dt$ also vanishes, so we may write the condition of zero torque as a
Lorentz-invariant statement

$$ \frac{dS^\alpha}{d\tau} \propto U^\alpha $$

and this will be valid in any locally inertial coordinate system, comoving or not.
Now, what is the constant of proportionality? Let us set

$$ \frac{dS^\alpha}{d\tau} = \Phi\, U^\alpha $$

We recall that $S^\alpha$ is defined so that

$$ S_\alpha U^\alpha = 0 $$

and therefore

$$ 0 = \frac{d}{d\tau}\left(S_\alpha U^\alpha\right) = \Phi\, U_\alpha U^\alpha + S_\alpha \frac{dU^\alpha}{d\tau} $$

so, since $U_\alpha U^\alpha = -1$,

$$ \Phi = S_\alpha \frac{dU^\alpha}{d\tau} = \frac{S_\alpha f^\alpha}{m} $$

The spin vector therefore suffers a change given by

$$ \frac{dS^\alpha}{d\tau} = \left(\frac{S_\beta f^\beta}{m}\right) U^\alpha \tag{5.1.13} $$

This phenomenon is known as the Thomas precession.$^1$ If we now turn on a
gravitational field, Eq. (5.1.13) and the Principle of General Covariance tell us that the
spin precesses according to the rule

$$ \frac{DS^\mu}{D\tau} = \left(\frac{S_\nu f^\nu}{m}\right) U^\mu = \left(S_\nu \frac{DU^\nu}{D\tau}\right) U^\mu \tag{5.1.14} $$

A vector obeying this differential equation is said to be defined by Fermi transport;$^2$
parallel transport is the special case for $f^\mu = 0$.
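
As a consistency check on (5.1.14) (a short sketch, using only $U_\mu U^\mu = -1$, $S_\mu U^\mu = 0$, and the Leibniz rule for $D/D\tau$), one can verify that Fermi transport preserves both the orthogonality condition (5.1.9) and the length of the spin:

```latex
\frac{D}{D\tau}\left(S_\mu U^\mu\right)
  = \left(S_\nu \frac{DU^\nu}{D\tau}\right) U_\mu U^\mu + S_\mu \frac{DU^\mu}{D\tau}
  = -S_\nu \frac{DU^\nu}{D\tau} + S_\mu \frac{DU^\mu}{D\tau} = 0
\\[6pt]
\frac{D}{D\tau}\left(S_\mu S^\mu\right)
  = 2\left(S_\nu \frac{DU^\nu}{D\tau}\right) S_\mu U^\mu = 0
```

The term proportional to $U^\mu$ in (5.1.14) is thus exactly what is required to keep the spin orthogonal to the four-velocity when the particle is accelerated.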


2 Electrodynamics 


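We recall that in the absence of gravitational fields the Maxwell equations of
electrodynamics can be written as

$$ \frac{\partial}{\partial x^\alpha} F^{\alpha\beta} = -J^\beta \tag{5.2.1} $$

$$ \frac{\partial}{\partial x^\alpha} F_{\beta\gamma} + \frac{\partial}{\partial x^\beta} F_{\gamma\alpha} + \frac{\partial}{\partial x^\gamma} F_{\alpha\beta} = 0 \tag{5.2.2} $$

where $J^\alpha$ is the current four-vector $\{\mathbf{J}, \varepsilon\}$, and $F^{\alpha\beta}$ is the field-strength tensor,
$F^{12} = B_3$, $F^{01} = E_1$, and so on. (See Section 2.7.) Suppose that we define $F^{\mu\nu}$
and $J^\mu$ in general coordinates by the requirements that they reduce to $F^{\alpha\beta}$ and
$J^\alpha$ in locally inertial Minkowskian coordinates, and that they behave as tensors
under general coordinate transformations. (That is, if $F^{\alpha\beta}$ and $J^\alpha$ are the values
measured in a locally inertial frame, then $F^{\mu\nu} = (\partial x^\mu/\partial \xi^\alpha)(\partial x^\nu/\partial \xi^\beta) F^{\alpha\beta}$ and
$J^\mu = (\partial x^\mu/\partial \xi^\alpha) J^\alpha$.) We can then make (5.2.1) and (5.2.2) generally covariant by replacing
all derivatives by covariant derivatives:

$$ F^{\mu\nu}{}_{;\mu} = -J^\nu \tag{5.2.3} $$

$$ F_{\mu\nu;\lambda} + F_{\lambda\mu;\nu} + F_{\nu\lambda;\mu} = 0 \tag{5.2.4} $$

it being now understood that indices are to be raised and lowered with $g_{\mu\nu}$ instead
of $\eta_{\alpha\beta}$, that is,

$$ F_{\lambda\kappa} \equiv g_{\lambda\mu}\, g_{\kappa\nu}\, F^{\mu\nu} \tag{5.2.5} $$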
Since $F^{\mu\nu}$ and $F_{\mu\nu}$ are antisymmetric, we may use (4.7.10) and (4.7.11) to rewrite
the Maxwell equations as

$$ \frac{\partial}{\partial x^\mu}\left(\sqrt{g}\, F^{\mu\nu}\right) = -\sqrt{g}\, J^\nu \tag{5.2.6} $$

$$ \frac{\partial}{\partial x^\lambda} F_{\mu\nu} + \frac{\partial}{\partial x^\nu} F_{\lambda\mu} + \frac{\partial}{\partial x^\mu} F_{\nu\lambda} = 0 \tag{5.2.7} $$

Equations (5.2.3)-(5.2.7) are true in the absence of gravitation and generally
covariant and hence, according to the Principle of General Covariance, also true in
arbitrary gravitational fields.
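
The step from (5.2.4) to (5.2.7) can be checked directly. A sketch, assuming only the standard form of the covariant derivative of a covariant tensor, $F_{\mu\nu;\lambda} = \partial F_{\mu\nu}/\partial x^\lambda - \Gamma^\rho_{\mu\lambda} F_{\rho\nu} - \Gamma^\rho_{\nu\lambda} F_{\mu\rho}$: summing the three cyclic permutations, the connection terms are

```latex
-\Gamma^\rho_{\mu\lambda} F_{\rho\nu} - \Gamma^\rho_{\nu\lambda} F_{\mu\rho}
-\Gamma^\rho_{\lambda\nu} F_{\rho\mu} - \Gamma^\rho_{\mu\nu} F_{\lambda\rho}
-\Gamma^\rho_{\nu\mu} F_{\rho\lambda} - \Gamma^\rho_{\lambda\mu} F_{\nu\rho}
= 0
```

Each connection coefficient appears twice, multiplied by $F_{\rho\sigma} + F_{\sigma\rho} = 0$, so only the ordinary derivatives survive.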

The electromagnetic force on a particle of charge e is given in the absence of
gravitation by Eq. (2.7.9):

$$ f^\alpha = e\, F^\alpha{}_\gamma\, \frac{dx^\gamma}{d\tau} \tag{5.2.8} $$

We immediately conclude that in general coordinates the electromagnetic force in
an arbitrary gravitational field is

$$ f^\mu = e\, F^\mu{}_\nu\, \frac{dx^\nu}{d\tau} \tag{5.2.9} $$

where of course

$$ F^\mu{}_\nu \equiv g_{\nu\lambda}\, F^{\mu\lambda} \tag{5.2.10} $$

Once again, we are using the Principle of General Covariance; Eq. (5.2.9) obviously
reduces to (5.2.8) in locally inertial Minkowskian coordinates, and it is generally
covariant, because $f^\mu$ is a vector (Section 5.1), $dx^\nu/d\tau$ is a vector, and $F^\mu{}_\nu$ is defined
as a tensor; therefore (5.2.9) is true.

It is instructive to evaluate the current vector $J^\nu$. In special relativity it is

$$ J^\alpha(x) = \sum_n e_n \int \delta^4(x - x_n)\, dx_n^\alpha \tag{5.2.11} $$

the integral being taken along the trajectory of the nth particle. [See Eq. (2.6.5).]
The four-dimensional delta function in a general coordinate system is defined by

$$ \int d^4x\, \Phi(x)\, \delta^4(x - y) = \Phi(y) \tag{5.2.12} $$

Since $g^{1/2}\, d^4x$ is a scalar, $g^{-1/2}\, \delta^4(x - y)$ must be a scalar, which of course reduces
to the ordinary delta function in special relativity, where $g = 1$. (In some works
it is this scalar that is defined as the delta function.) Thus the contravariant
vector that reduces to $J^\alpha$ in the absence of gravitation is

$$ J^\mu(x) = g^{-1/2}(x) \sum_n e_n \int \delta^4(x - x_n)\, dx_n^\mu \tag{5.2.13} $$

Note that the conservation law $\partial J^\alpha/\partial x^\alpha = 0$ of special relativity becomes in
general relativity $J^\mu{}_{;\mu} = 0$ or, using (4.7.7),

$$ \frac{\partial}{\partial x^\mu}\left(g^{1/2} J^\mu\right) = 0 \tag{5.2.14} $$

The factor $g^{-1/2}$ in (5.2.13) is just what is needed to cancel the $g^{1/2}$ in (5.2.14), so
that (5.2.14) still just expresses the constancy of the $e_n$.
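
This last statement can be made explicit. A sketch, inserting (5.2.13) and assuming that each trajectory is parametrized by $\tau$, so that $dx_n^\mu = (dx_n^\mu/d\tau)\, d\tau$, and that the point x lies at a finite time, so that the end-point terms vanish:

```latex
\frac{\partial}{\partial x^\mu}\left(g^{1/2} J^\mu\right)
  = \sum_n e_n \int \frac{\partial}{\partial x^\mu}\,
      \delta^4\bigl(x - x_n(\tau)\bigr)\, \frac{dx_n^\mu}{d\tau}\, d\tau
  = -\sum_n e_n \int \frac{d}{d\tau}\,
      \delta^4\bigl(x - x_n(\tau)\bigr)\, d\tau
  = 0
```

The metric determinant has dropped out entirely, and the result relies only on each $e_n$ being constant along its trajectory.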


3 Energy -Momentum Tensor 


The density and current of energy and momentum were united in Section 2.8
into a symmetric tensor $T^{\alpha\beta}$ satisfying the conservation equations

$$ \frac{\partial T^{\alpha\beta}}{\partial x^\alpha} = G^\beta \tag{5.3.1} $$

where $G^\beta$ is the density of the external force $f^\beta$ acting on the system. (For an isolated
system, $G^\beta = 0$.) Define $T^{\mu\nu}$ and $G^\nu$ as contravariant tensors that reduce to the
special relativistic $T^{\alpha\beta}$ and $G^\alpha$ in the absence of gravitation. Then the generally
covariant equation that agrees with (5.3.1) in locally inertial systems is

$$ T^{\mu\nu}{}_{;\mu} = G^\nu \tag{5.3.2} $$

or, using (4.7.9),

$$ \frac{1}{\sqrt{g}} \frac{\partial}{\partial x^\mu}\left(\sqrt{g}\, T^{\mu\nu}\right) = G^\nu - \Gamma^\nu_{\mu\lambda}\, T^{\mu\lambda} \tag{5.3.3} $$

The factor $\sqrt{g}$ is familiar from electrodynamics, and arises from the fact that the
invariant volume is $\sqrt{g}\, d^4x$. In contrast, the second term on the right represents
a gravitational force density. Just as we would expect, this force depends on the
system on which it acts only through the energy-momentum tensor.

For a system of point particles the special-relativistic energy-momentum
tensor is given in Section 2.8 as

$$ T^{\alpha\beta} = \sum_n m_n \int \frac{dx_n^\alpha}{d\tau}\, dx_n^\beta\, \delta^4(x - x_n) \tag{5.3.4} $$

the integral again being taken along the particle trajectory. Following precisely
the same reasoning as we did for $J^\alpha$ in the last section, we conclude that the
contravariant tensor that agrees with (5.3.4) in the absence of gravitation is

$$ T^{\mu\nu} = g^{-1/2} \sum_n m_n \int \frac{dx_n^\mu}{d\tau}\, dx_n^\nu\, \delta^4(x - x_n) \tag{5.3.5} $$
For an electromagnetic field $F^{\alpha\beta}$ the special-relativistic energy-momentum
tensor was calculated in Section 2.8 as

$$ T^{\alpha\beta} = F^\alpha{}_\gamma F^{\beta\gamma} - \tfrac{1}{4}\, \eta^{\alpha\beta} F_{\gamma\delta} F^{\gamma\delta} \tag{5.3.6} $$

It takes no effort to see that the contravariant tensor that agrees with (5.3.6)
in the absence of gravitation is

$$ T^{\mu\nu} = F^\mu{}_\lambda F^{\nu\lambda} - \tfrac{1}{4}\, g^{\mu\nu} F_{\lambda\kappa} F^{\lambda\kappa} \tag{5.3.7} $$

For a system consisting of particles and radiation, the energy-momentum tensor
is the sum of (5.3.5) and (5.3.7).

Returning for a moment to the purely material energy-momentum tensor
(5.3.5), we easily compute that

$$ \int T^{\mu 0}\, g^{1/2}\, d^3x = \sum_n m_n \frac{dx_n^\mu}{d\tau} $$

the sum running over all particles in the volume of integration. This suggests that
$T^{\mu 0} g^{1/2}$ is to be regarded in general as the spatial density of energy and momentum.
In particular, we are tempted to define the energy, momentum, and angular
momentum for an arbitrary system by

$$ P^\mu \equiv \int T^{\mu 0}\, g^{1/2}\, d^3x \tag{5.3.8} $$

$$ J^{\mu\nu} \equiv \int \left(x^\mu T^{\nu 0} - x^\nu T^{\mu 0}\right) g^{1/2}\, d^3x \tag{5.3.9} $$

However, these quantities are not contravariant tensors and are not conserved,
because $T^{\mu\nu} g^{1/2}$ is not conserved, that is, $\partial(T^{\mu\nu} g^{1/2})/\partial x^\nu$ does not vanish, owing to
the exchange of energy and momentum between matter and gravitation.
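
For an isolated system this can be made explicit. A sketch, using the standard identity $T^{\mu\nu}{}_{;\nu} = g^{-1/2}\, \partial(g^{1/2} T^{\mu\nu})/\partial x^\nu + \Gamma^\mu_{\nu\lambda} T^{\nu\lambda}$ for a symmetric tensor, together with $T^{\mu\nu}{}_{;\nu} = 0$:

```latex
\frac{\partial}{\partial x^\nu}\left(g^{1/2}\, T^{\mu\nu}\right)
  = -\, g^{1/2}\, \Gamma^\mu_{\nu\lambda}\, T^{\nu\lambda}
```

The right-hand side is the gravitational force density discussed above. It can be made to vanish at a single point by a choice of locally inertial coordinates, but not in general throughout a finite region, which is why the integrals (5.3.8) and (5.3.9) need not be constant.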


4 Hydrodynamics and Hydrostatics 

In the absence of gravitation, the energy-momentum tensor of a perfect
fluid is that given by (2.10.7):

$$ T^{\alpha\beta} = p\, \eta^{\alpha\beta} + (p + \rho)\, U^\alpha U^\beta \tag{5.4.1} $$

where $U^\alpha$ is the fluid four-velocity, $U^0 = (1 - \mathbf{v}^2)^{-1/2}$, $\mathbf{U} = \mathbf{v}\, U^0$. The
contravariant tensor that reduces to (5.4.1) in the absence of gravitation is

$$ T^{\mu\nu} = p\, g^{\mu\nu} + (p + \rho)\, U^\mu U^\nu \tag{5.4.2} $$

where $U^\mu$ is the local value of $dx^\mu/d\tau$ for a comoving fluid element. Note that p
and $\rho$ are always defined as the pressure and energy density measured by an
observer in a locally inertial frame that happens to be moving with the fluid at the
instant of measurement, and are therefore scalars. The conditions of
energy-momentum conservation give the hydrodynamic equations

$$ 0 = T^{\mu\nu}{}_{;\nu} = \frac{\partial p}{\partial x^\nu}\, g^{\mu\nu} + g^{-1/2} \frac{\partial}{\partial x^\nu}\left[ g^{1/2} (p + \rho)\, U^\mu U^\nu \right] + \Gamma^\mu_{\nu\lambda}\, (p + \rho)\, U^\nu U^\lambda \tag{5.4.3} $$

The last term represents the gravitational force on the system. Note also that
since $\eta_{\alpha\beta} U^\alpha U^\beta = -1$ in the absence of gravitation, we must in the presence of
gravitation have

$$ g_{\mu\nu}\, U^\mu U^\nu = -1 \tag{5.4.4} $$

Consider as an example the case of a fluid in hydrostatic equilibrium. Since
it is not moving, (5.4.4) gives

$$ U^0 = (-g_{00})^{-1/2} \qquad U^\lambda = 0 \quad \text{for } \lambda \neq 0 $$

Furthermore, all temporal derivatives of $g_{\mu\nu}$, p, or $\rho$ vanish. In particular,

$$ \Gamma^\mu_{00} = -\frac{1}{2}\, g^{\mu\nu} \frac{\partial g_{00}}{\partial x^\nu} \qquad \frac{\partial}{\partial x^\nu}\left[ g^{1/2} (p + \rho)\, U^\mu U^\nu \right] = 0 $$

Multiplying (5.4.3) by $g_{\mu\lambda}$ gives then

$$ -\frac{\partial p}{\partial x^\lambda} = (p + \rho)\, \frac{\partial}{\partial x^\lambda} \ln (-g_{00})^{1/2} \tag{5.4.5} $$

This is trivial for $\lambda = 0$, whereas for $\lambda$ spacelike it is nothing but the ordinary
nonrelativistic equation of hydrostatic equilibrium, except that $p + \rho$ appears
instead of the mass density, and $\ln (-g_{00})^{1/2}$ appears instead of the gravitational
potential. This equation is soluble if p is given as a function of $\rho$. We then find
that

$$ \int \frac{dp(\rho)}{p(\rho) + \rho} = -\ln \sqrt{-g_{00}} + \text{constant} \tag{5.4.6} $$

For instance, if $p(\rho)$ is given by a power law:

$$ p(\rho) \propto \rho^N \tag{5.4.7} $$

then (5.4.6) gives for $N \neq 1$

$$ \frac{p + \rho}{\rho} \propto (-g_{00})^{(1-N)/2N} \tag{5.4.8} $$

and for $N = 1$

$$ \rho \propto (-g_{00})^{-(p+\rho)/2p} \tag{5.4.9} $$

This, incidentally, shows that gravitation can never produce hydrostatic
equilibrium in a finite highly relativistic fluid with $p = \rho/3$, for then (5.4.9) gives

$$ \rho \propto (-g_{00})^{-2} \tag{5.4.10} $$

Since $\rho$ must vanish outside the fluid, $g_{00}$ would have to become singular at its
surface.
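
The power-law results can be checked by doing the integral in (5.4.6) explicitly. A sketch, writing the proportionality constant as K (a symbol introduced here for illustration), so that $p = K\rho^N$:

```latex
\int \frac{dp}{p + \rho}
  = \int \frac{K N \rho^{N-2}\, d\rho}{1 + K\rho^{N-1}}
  = \frac{N}{N-1}\, \ln\!\left(1 + K\rho^{N-1}\right)
  = \frac{N}{N-1}\, \ln\frac{p + \rho}{\rho}
  \qquad (N \neq 1)
\\[6pt]
\int \frac{dp}{p + \rho}
  = \frac{K}{1 + K}\, \ln \rho
  = \frac{p}{p + \rho}\, \ln \rho
  \qquad (N = 1)
```

Setting each expression equal to $-\tfrac{1}{2} \ln(-g_{00})$ plus a constant reproduces the two power laws quoted above; for $p = \rho/3$ the exponent in the N = 1 case is $-(4/3)/(2/3) = -2$.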
“When I behold, upon the
night’s starr’d face, huge 
cloudy symbols ...” 

John Keats, When I Have 
Fears That I May Cease 
to Be 


6 CURVATURE 


We are going to work out the gravitational field equations by applying the 
Principle of Equivalence to gravitation itself. As in the last chapter, it is most 
convenient to apply this principle by looking for field equations that are generally 
covariant and that reduce to the proper form for weak fields. Thus we must address 
ourselves to the question : What tensors can be formed from the metric tensor and 
its derivatives ? In this chapter we treat this as a purely mathematical problem, 
as in fact it was treated by Gauss and Riemann; the information we compile 
here will then be used in the next chapter to guide us in our search for the field 
equations of gravitation. 


1 Definition of the Curvature Tensor 

We want to construct a tensor out of the metric tensor and its derivatives.
If we use only $g_{\mu\nu}$ and its first derivatives, then no new tensor can be constructed,
for at any point we can find a coordinate system in which the first derivatives of
the metric tensor vanish, so in this coordinate system the desired tensor must be
equal to one of those that can be constructed out of the metric tensor alone
(e.g., $g_{\mu\nu}$ or $g^{\mu\nu}$ or $\sqrt{g}$, and so on), and since this is an equality between tensors
it must be true in all coordinate systems.
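The next simplest possibility is to construct a tensor out of the metric tensor
and its first and second derivatives. To do this, let us recall the transformation rule
for the affine connection:

$$ \Gamma^\lambda_{\mu\nu} = \frac{\partial x^\lambda}{\partial x'^\tau} \frac{\partial x'^\rho}{\partial x^\mu} \frac{\partial x'^\sigma}{\partial x^\nu}\, \Gamma'^\tau_{\rho\sigma} + \frac{\partial x^\lambda}{\partial x'^\tau} \frac{\partial^2 x'^\tau}{\partial x^\mu\, \partial x^\nu} \tag{6.1.1} $$

(This is Eq. (4.5.2), with primed and unprimed coordinates interchanged.) It is
the inhomogeneous term on the right that keeps $\Gamma^\lambda_{\mu\nu}$ from being a tensor, so let
us isolate this term:

$$ \frac{\partial^2 x'^\tau}{\partial x^\mu\, \partial x^\nu} = \frac{\partial x'^\tau}{\partial x^\lambda}\, \Gamma^\lambda_{\mu\nu} - \frac{\partial x'^\rho}{\partial x^\mu} \frac{\partial x'^\sigma}{\partial x^\nu}\, \Gamma'^\tau_{\rho\sigma} \tag{6.1.2} $$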

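To get rid of the left-hand side, we use the commutativity of partial differentiation.
Differentiation with respect to $x^\kappa$ gives

$$ \begin{aligned} \frac{\partial^3 x'^\tau}{\partial x^\kappa\, \partial x^\mu\, \partial x^\nu} = {} & \Gamma^\lambda_{\mu\nu} \left( \frac{\partial x'^\tau}{\partial x^\eta}\, \Gamma^\eta_{\kappa\lambda} - \frac{\partial x'^\rho}{\partial x^\kappa} \frac{\partial x'^\sigma}{\partial x^\lambda}\, \Gamma'^\tau_{\rho\sigma} \right) \\ & - \Gamma'^\tau_{\rho\sigma}\, \frac{\partial x'^\rho}{\partial x^\mu} \left( \frac{\partial x'^\sigma}{\partial x^\eta}\, \Gamma^\eta_{\kappa\nu} - \frac{\partial x'^\eta}{\partial x^\kappa} \frac{\partial x'^\xi}{\partial x^\nu}\, \Gamma'^\sigma_{\eta\xi} \right) \\ & - \Gamma'^\tau_{\rho\sigma}\, \frac{\partial x'^\sigma}{\partial x^\nu} \left( \frac{\partial x'^\rho}{\partial x^\eta}\, \Gamma^\eta_{\kappa\mu} - \frac{\partial x'^\eta}{\partial x^\kappa} \frac{\partial x'^\xi}{\partial x^\mu}\, \Gamma'^\rho_{\eta\xi} \right) \\ & + \frac{\partial x'^\tau}{\partial x^\lambda} \frac{\partial \Gamma^\lambda_{\mu\nu}}{\partial x^\kappa} - \frac{\partial x'^\rho}{\partial x^\mu} \frac{\partial x'^\sigma}{\partial x^\nu} \frac{\partial x'^\eta}{\partial x^\kappa} \frac{\partial \Gamma'^\tau_{\rho\sigma}}{\partial x'^\eta} \end{aligned} $$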

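or, collecting similar terms and juggling indices a bit,

$$ \begin{aligned} \frac{\partial^3 x'^\tau}{\partial x^\kappa\, \partial x^\mu\, \partial x^\nu} = {} & \frac{\partial x'^\tau}{\partial x^\lambda} \left( \frac{\partial \Gamma^\lambda_{\mu\nu}}{\partial x^\kappa} + \Gamma^\eta_{\mu\nu}\, \Gamma^\lambda_{\kappa\eta} \right) \\ & - \frac{\partial x'^\rho}{\partial x^\mu} \frac{\partial x'^\sigma}{\partial x^\nu} \frac{\partial x'^\eta}{\partial x^\kappa} \left( \frac{\partial \Gamma'^\tau_{\rho\sigma}}{\partial x'^\eta} - \Gamma'^\tau_{\rho\lambda}\, \Gamma'^\lambda_{\eta\sigma} - \Gamma'^\tau_{\lambda\sigma}\, \Gamma'^\lambda_{\eta\rho} \right) \\ & - \Gamma'^\tau_{\rho\sigma}\, \frac{\partial x'^\sigma}{\partial x^\lambda} \left( \Gamma^\lambda_{\mu\nu} \frac{\partial x'^\rho}{\partial x^\kappa} + \Gamma^\lambda_{\kappa\nu} \frac{\partial x'^\rho}{\partial x^\mu} + \Gamma^\lambda_{\kappa\mu} \frac{\partial x'^\rho}{\partial x^\nu} \right) \end{aligned} \tag{6.1.3} $$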

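Now subtracting the same equation with $\nu$ and $\kappa$ interchanged, we find that all
terms involving products of $\Gamma$ with $\Gamma'$ drop out, leaving

$$ \begin{aligned} 0 = {} & \frac{\partial x'^\tau}{\partial x^\lambda} \left( \frac{\partial \Gamma^\lambda_{\mu\nu}}{\partial x^\kappa} - \frac{\partial \Gamma^\lambda_{\mu\kappa}}{\partial x^\nu} + \Gamma^\eta_{\mu\nu}\, \Gamma^\lambda_{\kappa\eta} - \Gamma^\eta_{\mu\kappa}\, \Gamma^\lambda_{\nu\eta} \right) \\ & - \frac{\partial x'^\rho}{\partial x^\mu} \frac{\partial x'^\sigma}{\partial x^\nu} \frac{\partial x'^\eta}{\partial x^\kappa} \left( \frac{\partial \Gamma'^\tau_{\rho\sigma}}{\partial x'^\eta} - \frac{\partial \Gamma'^\tau_{\rho\eta}}{\partial x'^\sigma} - \Gamma'^\tau_{\lambda\sigma}\, \Gamma'^\lambda_{\eta\rho} + \Gamma'^\tau_{\lambda\eta}\, \Gamma'^\lambda_{\sigma\rho} \right) \end{aligned} $$

This may be written as a transformation rule,

$$ R'^\tau{}_{\rho\sigma\eta} = \frac{\partial x'^\tau}{\partial x^\lambda} \frac{\partial x^\mu}{\partial x'^\rho} \frac{\partial x^\nu}{\partial x'^\sigma} \frac{\partial x^\kappa}{\partial x'^\eta}\, R^\lambda{}_{\mu\nu\kappa} \tag{6.1.4} $$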
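where

$$ R^\lambda{}_{\mu\nu\kappa} \equiv \frac{\partial \Gamma^\lambda_{\mu\nu}}{\partial x^\kappa} - \frac{\partial \Gamma^\lambda_{\mu\kappa}}{\partial x^\nu} + \Gamma^\eta_{\mu\nu}\, \Gamma^\lambda_{\kappa\eta} - \Gamma^\eta_{\mu\kappa}\, \Gamma^\lambda_{\nu\eta} \tag{6.1.5} $$

Equation (6.1.4) says that $R^\lambda{}_{\mu\nu\kappa}$ is a tensor; it is called the Riemann-Christoffel
curvature tensor.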
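
Definition (6.1.5) is easy to evaluate numerically once the affine connection is known. The following Python sketch is an illustration only, not part of the original derivation: the function names, the central-difference step `h`, and the two test connections (the flat plane in polar coordinates and the unit two-sphere) are assumptions chosen for the example. It builds $R^\lambda{}_{\mu\nu\kappa}$ term by term from (6.1.5).

```python
import math

def gamma_polar(x):
    """G[l][m][n] = Gamma^l_{mn} for the flat plane, x = (r, theta)."""
    r = x[0]
    G = [[[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]]]
    G[0][1][1] = -r
    G[1][0][1] = G[1][1][0] = 1.0 / r
    return G

def gamma_sphere(x):
    """G[l][m][n] = Gamma^l_{mn} for the unit sphere, x = (theta, phi)."""
    th = x[0]
    G = [[[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]]]
    G[0][1][1] = -math.sin(th) * math.cos(th)
    G[1][0][1] = G[1][1][0] = math.cos(th) / math.sin(th)
    return G

def riemann(gamma, x, h=1e-4):
    """R[l][m][v][k] = R^l_{mvk} from Eq. (6.1.5), using central differences."""
    n = len(x)

    def dgamma(k):
        # Returns d[l][m][v] = dGamma^l_{mv}/dx^k at the point x.
        xp, xm = list(x), list(x)
        xp[k] += h
        xm[k] -= h
        Gp, Gm = gamma(xp), gamma(xm)
        return [[[(Gp[l][m][v] - Gm[l][m][v]) / (2 * h) for v in range(n)]
                 for m in range(n)] for l in range(n)]

    dG = [dgamma(k) for k in range(n)]
    G = gamma(x)
    return [[[[dG[k][l][m][v] - dG[v][l][m][k]
               + sum(G[e][m][v] * G[l][k][e] - G[e][m][k] * G[l][v][e]
                     for e in range(n))
               for k in range(n)] for v in range(n)]
             for m in range(n)] for l in range(n)]

flat = riemann(gamma_polar, [2.0, 0.7])
sphere = riemann(gamma_sphere, [1.0, 0.3])
```

With the sign convention of (6.1.5), every component of `flat` vanishes to within the finite-difference error, while on the unit sphere one finds $R^\theta{}_{\varphi\theta\varphi} = -\sin^2\theta$ and $R^\varphi{}_{\theta\theta\varphi} = 1$.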

The existence of the tensor $R^\lambda{}_{\mu\nu\kappa}$ raises once again the question of whether or
not the Principle of Equivalence or the Principle of General Covariance uniquely
determines the effects of gravitation on arbitrary physical systems. For instance,
let us ask whether the correct equation of motion for a freely falling particle of
spin $S^\mu$ might be of the form

$$ 0 = \frac{d^2 x^\lambda}{d\tau^2} + \Gamma^\lambda_{\mu\nu}\, \frac{dx^\mu}{d\tau} \frac{dx^\nu}{d\tau} + f\, R^\lambda{}_{\mu\nu\kappa}\, \frac{dx^\mu}{d\tau} \frac{dx^\nu}{d\tau}\, S^\kappa \tag{6.1.6} $$

(with f an unknown scalar) instead of the familiar form

$$ 0 = \frac{d^2 x^\lambda}{d\tau^2} + \Gamma^\lambda_{\mu\nu}\, \frac{dx^\mu}{d\tau} \frac{dx^\nu}{d\tau} \tag{6.1.7} $$

Both Eqs. (6.1.6) and (6.1.7) are generally covariant, and both reduce in the
absence of gravitation to the correct special-relativistic equation $dU^\alpha/d\tau = 0$.
How then can we tell whether (6.1.6) or (6.1.7) is correct?

The answer again is one of scale. Suppose that our particle has a characteristic
linear dimension d, and that the gravitational field has a characteristic space-time
dimension D. The Riemann-Christoffel tensor has one more derivative of the
metric than the affine connection, so the ratio of the third term in (6.1.6) to the
second term is proportional to 1/D; dimensional considerations then require that
this ratio be roughly of order d/D. Thus, barring special circumstances that might
make one term or the other anomalously large or small, we can regard the last
term in (6.1.6) as being negligible if our particle is very much smaller than the
characteristic dimensions of the gravitational field, and (6.1.7) is the correct
equation of motion. Of course, if our particle is not much smaller than the
gravitational field scale (as in the case of the moon moving in the gravitational field of the
earth), then the Principle of Equivalence or the Principle of General Covariance
must be applied to the infinitesimal elements of which the particle is composed,
although (6.1.6) or (6.1.7) might give a fair phenomenological representation of the
motion of the whole particle.


2 Uniqueness of the Curvature Tensor 

We next prove that $R^\lambda{}_{\mu\nu\kappa}$ is the only tensor that can be constructed from the
metric tensor and its first and second derivatives, and is linear in the second
derivatives.

For this purpose it proves extremely convenient to fix our attention on a
particular point X, and adopt a locally inertial coordinate system, in which at this
point the affine connection vanishes. Furthermore, we consider only the limited
class of coordinate transformations that leave the affine connection zero; according
to Eq. (6.1.1), these are simply the transformations $x \to x'$ with

$$ \left( \frac{\partial^2 x'^\tau}{\partial x^\mu\, \partial x^\nu} \right)_{x = X} = 0 \tag{6.2.1} $$

Any quantity that transforms as a tensor under general coordinate transformations
will have to transform as a tensor under this limited class of transformations, and
this requirement proves sufficiently strong for our purposes.

Since the affine connection vanishes at X, all first derivatives of the metric
tensor vanish at X [see Eq. (3.3.5)] and our desired new tensor must be just a
linear combination of the second derivatives of the metric tensor, or equivalently,
of the first derivatives of the affine connection. We see from Eq. (6.1.3) that for
$\Gamma^\lambda_{\mu\nu}$ and $\Gamma'^\tau_{\rho\sigma}$ zero, the derivatives of the affine connection obey the transformation
rule

$$ \frac{\partial \Gamma'^\tau_{\rho\sigma}}{\partial x'^\eta} = \frac{\partial x'^\tau}{\partial x^\lambda} \frac{\partial x^\mu}{\partial x'^\rho} \frac{\partial x^\nu}{\partial x'^\sigma} \frac{\partial x^\kappa}{\partial x'^\eta}\, \frac{\partial \Gamma^\lambda_{\mu\nu}}{\partial x^\kappa} - \frac{\partial x^\mu}{\partial x'^\rho} \frac{\partial x^\nu}{\partial x'^\sigma} \frac{\partial x^\kappa}{\partial x'^\eta}\, \frac{\partial^3 x'^\tau}{\partial x^\kappa\, \partial x^\mu\, \partial x^\nu} \qquad \text{at } x = X \tag{6.2.2} $$

What linear combination of $\partial\Gamma/\partial x$ can we take that will behave like a tensor?
Clearly, it must be such as to eliminate the inhomogeneous term in this
transformation rule. However, at any given point X the inhomogeneous term is a
completely arbitrary function of its indices $\rho$, $\sigma$, $\eta$, subject only to the condition
that it be symmetric in these indices. Hence the only way of taking a linear
combination of $\partial\Gamma/\partial x$ that will transform as a tensor under all transformations
$x \to x'$ satisfying (6.2.1) is to antisymmetrize in $\kappa$ and $\nu$ (or equivalently in $\kappa$ and
$\mu$), so that (6.2.2) becomes

$$ T'^\tau{}_{\rho\sigma\eta} = \frac{\partial x'^\tau}{\partial x^\lambda} \frac{\partial x^\mu}{\partial x'^\rho} \frac{\partial x^\nu}{\partial x'^\sigma} \frac{\partial x^\kappa}{\partial x'^\eta}\, T^\lambda{}_{\mu\nu\kappa} $$

where at $x = X$

$$ T^\lambda{}_{\mu\nu\kappa} \equiv \frac{\partial \Gamma^\lambda_{\mu\nu}}{\partial x^\kappa} - \frac{\partial \Gamma^\lambda_{\mu\kappa}}{\partial x^\nu} \tag{6.2.3} $$

That is, the desired tensor must be a $T^\lambda{}_{\mu\nu\kappa}$ given by (6.2.3) when $\Gamma$ vanishes. But
when $\Gamma = 0$ the Riemann-Christoffel tensor satisfies (6.2.3), so in locally inertial
systems $T^\lambda{}_{\mu\nu\kappa} = R^\lambda{}_{\mu\nu\kappa}$. But this is an equality between tensors, so since it is true
in one class of coordinate systems it is true in all coordinate systems; that is, the
only tensor T of the form desired is just $R^\lambda{}_{\mu\nu\kappa}$.
3 Round Trips by Parallel Transport


Both for its own sake and as preparation for the next section, we now take up
the question whether a vector $S_\mu$, when carried along a closed curve C by the
equation of parallel transport (see Sections 4.9 and 5.1)

$$ \frac{dS_\mu}{d\tau} = \Gamma^\lambda_{\mu\nu}\, \frac{dx^\nu}{d\tau}\, S_\lambda \tag{6.3.1} $$

will come back to its original value after one complete circuit of the curve.

We can answer this question by applying the method used in the familiar
proof of Stokes’s theorem. Consider the curve C as the edge of some two-dimensional
surface A, and divide A into small cells bounded by little closed curves $C_N$. The
change in $S_\mu$ when parallel-transported around C can be written as the sum of the
changes in $S_\mu$ when transported around each of these little curves,

$$ \Delta S_\mu = \sum_N \Delta_N S_\mu \tag{6.3.2} $$

because the change in $S_\mu$ around any one interior cell is canceled by the changes
around adjacent cells, leaving only the contribution from the outer cell edges that
make up C. Therefore we must ask only whether $S_\mu$ changes when
parallel-transported around a small closed curve. If the curve is small enough, we can expand
$\Gamma^\lambda_{\mu\nu}(x)$ around some point $X = x(\tau_0)$ on the curve

$$ \Gamma^\lambda_{\mu\nu}(x) = \Gamma^\lambda_{\mu\nu}(X) + (x^\rho - X^\rho)\, \frac{\partial}{\partial X^\rho} \Gamma^\lambda_{\mu\nu}(X) + \cdots \tag{6.3.3} $$

Then (6.3.1) gives to first order in $x^\mu - X^\mu$

$$ S_\mu(\tau) = S_\mu(\tau_0) + \Gamma^\lambda_{\mu\nu}(X)\, \bigl(x^\nu(\tau) - X^\nu\bigr)\, S_\lambda(\tau_0) + \cdots \tag{6.3.4} $$

and by using (6.3.3) and (6.3.4) in (6.3.1) we obtain an equation valid to second
order,

$$ S_\mu(\tau) \simeq S_\mu(\tau_0) + \int_{\tau_0}^{\tau} \left[ \Gamma^\lambda_{\mu\nu}(X) + \bigl(x^\rho(\tau) - X^\rho\bigr)\, \frac{\partial}{\partial X^\rho} \Gamma^\lambda_{\mu\nu}(X) + \cdots \right] \left[ S_\lambda(\tau_0) + S_\sigma(\tau_0)\, \Gamma^\sigma_{\lambda\rho}(X)\, \bigl(x^\rho(\tau) - X^\rho\bigr) + \cdots \right] \frac{dx^\nu}{d\tau}\, d\tau $$
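or discarding terms of third or higher order in $x - X$,

$$ S_\mu(\tau) \simeq S_\mu(\tau_0) + \Gamma^\lambda_{\mu\nu}(X)\, S_\lambda(\tau_0) \int_{\tau_0}^{\tau} \frac{dx^\nu}{d\tau}\, d\tau + \left[ \frac{\partial \Gamma^\sigma_{\mu\nu}(X)}{\partial X^\rho} + \Gamma^\sigma_{\lambda\rho}(X)\, \Gamma^\lambda_{\mu\nu}(X) \right] S_\sigma(\tau_0) \int_{\tau_0}^{\tau} (x^\rho - X^\rho)\, \frac{dx^\nu}{d\tau}\, d\tau $$

If $x^\rho(\tau)$ returns to its original value $X^\rho$ at some $\tau = \tau_1$, then obviously

$$ \int_{\tau_0}^{\tau_1} \frac{dx^\nu}{d\tau}\, d\tau = 0 $$

so the change in $S_\mu$ when parallel-transported around the small closed curve
$x^\rho(\tau)$ is of second order:

$$ \Delta S_\mu \equiv S_\mu(\tau_1) - S_\mu(\tau_0) = \left[ \frac{\partial \Gamma^\sigma_{\mu\nu}(X)}{\partial X^\rho} + \Gamma^\lambda_{\mu\nu}(X)\, \Gamma^\sigma_{\lambda\rho}(X) \right] S_\sigma(\tau_0) \oint x^\rho\, dx^\nu \tag{6.3.5} $$

where

$$ \oint x^\rho\, dx^\nu \equiv \int_{\tau_0}^{\tau_1} x^\rho\, \frac{dx^\nu}{d\tau}\, d\tau $$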

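This integral does not generally vanish; for instance, if our curve is a small
parallelogram with edges $\delta a^\rho$, $\delta b^\rho$, it equals

$$ \oint x^\rho\, dx^\nu = \delta a^\rho\, \delta b^\nu - \delta a^\nu\, \delta b^\rho $$

However, it is always antisymmetric in $\rho$ and $\nu$, as can be seen by partial integration:

$$ \oint x^\rho\, dx^\nu = \int_{\tau_0}^{\tau_1} \left[ \frac{d}{d\tau}\left(x^\rho x^\nu\right) - x^\nu\, \frac{dx^\rho}{d\tau} \right] d\tau = -\oint x^\nu\, dx^\rho \tag{6.3.6} $$

Thus the coefficient of this integral in (6.3.5) can be replaced by its antisymmetric
part, which is just half the curvature tensor (6.1.5), so

$$ \Delta S_\mu = \frac{1}{2}\, R^\sigma{}_{\mu\nu\rho}\, S_\sigma \oint x^\rho\, dx^\nu \tag{6.3.7} $$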

Our conclusion is that an arbitrary vector $S_\mu$ will not change when
parallel-transported around an arbitrary small closed curve at X, if and only if $R^\sigma{}_{\mu\nu\rho}$ vanishes at X.
We have already remarked that the change in $S_\mu$ when parallel-transported about
a finite closed curve C may be computed by breaking up the area A bounded by C
into small cells and then adding up the changes in $S_\mu$ when parallel-transported
around the edges of these cells; hence, if $R^\sigma{}_{\mu\nu\rho}$ vanishes throughout A, then an
arbitrary vector $S_\mu$ will not change when parallel-transported around C.
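
The second-order formula (6.3.7) can be tested numerically. The Python sketch below is illustrative only and rests on assumptions not in the text: it uses the unit two-sphere with coordinates $(\theta, \varphi)$, whose connection is $\Gamma^\theta_{\varphi\varphi} = -\sin\theta\cos\theta$, $\Gamma^\varphi_{\theta\varphi} = \cot\theta$; it integrates the parallel-transport equation (6.3.1) with a Runge-Kutta step around a small coordinate rectangle with edges $\delta a = (\delta, 0)$ and $\delta b = (0, \delta)$; and all function names are hypothetical. For this loop $\oint x^\theta dx^\varphi = -\oint x^\varphi dx^\theta = \delta^2$, and with the sphere's curvature components computed from (6.1.5), $R^\theta{}_{\varphi\theta\varphi} = -\sin^2\theta$ and $R^\varphi{}_{\theta\theta\varphi} = 1$, the prediction of (6.3.7) is $\Delta S_\theta = -S_\varphi\, \delta^2$ and $\Delta S_\varphi = \sin^2\theta\, S_\theta\, \delta^2$.

```python
import math

def gamma_sphere(x):
    """G[l][m][n] = Gamma^l_{mn} on the unit sphere, x = (theta, phi)."""
    th = x[0]
    G = [[[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]]]
    G[0][1][1] = -math.sin(th) * math.cos(th)
    G[1][0][1] = G[1][1][0] = math.cos(th) / math.sin(th)
    return G

def transport_rhs(x, v, S):
    """Right-hand side of (6.3.1): dS_m/dtau = Gamma^l_{mn} v^n S_l."""
    G = gamma_sphere(x)
    return [sum(G[l][m][n] * v[n] * S[l] for l in range(2) for n in range(2))
            for m in range(2)]

def transport_edge(x0, dx, S, steps=200):
    """RK4 transport along the coordinate segment x0 -> x0 + dx, tau in [0, 1]."""
    h = 1.0 / steps

    def at(t):
        return [x0[k] + t * dx[k] for k in range(2)]

    for i in range(steps):
        t = i * h
        k1 = transport_rhs(at(t), dx, S)
        k2 = transport_rhs(at(t + h / 2), dx, [s + h / 2 * k for s, k in zip(S, k1)])
        k3 = transport_rhs(at(t + h / 2), dx, [s + h / 2 * k for s, k in zip(S, k2)])
        k4 = transport_rhs(at(t + h), dx, [s + h * k for s, k in zip(S, k3)])
        S = [s + h / 6 * (a + 2 * b + 2 * c + d)
             for s, a, b, c, d in zip(S, k1, k2, k3, k4)]
    return S

def loop_change(X, delta, S0):
    """Change in S after transport around the rectangle a, b, -a, -b from X."""
    x, S = list(X), list(S0)
    for edge in ([delta, 0.0], [0.0, delta], [-delta, 0.0], [0.0, -delta]):
        S = transport_edge(x, edge, S)
        x = [xi + ei for xi, ei in zip(x, edge)]
    return [s1 - s0 for s1, s0 in zip(S, S0)]

theta0, delta, S0 = 1.0, 0.01, [1.0, 0.5]
numeric = loop_change([theta0, 0.0], delta, S0)
# Prediction of (6.3.7) with the curvature evaluated at the starting corner X.
predicted = [-S0[1] * delta ** 2, math.sin(theta0) ** 2 * S0[0] * delta ** 2]
```

The two lists agree up to terms of third order in $\delta$, which (6.3.7) neglects; repeating the run with `gamma_sphere` replaced by a flat-space connection would give no change at all at this order.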
Now suppose that $R^\sigma{}_{\mu\nu\rho}$ does vanish. Consider a closed curve consisting of
two segments A and B joining points $x^\mu$ and $X^\mu$. The change in a vector $S_\mu$ when
parallel-transported from x to X along A must be canceled by the change in $S_\mu$
when parallel-transported along B from X to x, that is,

$$ \Delta^A_{x \to X} S_\mu + \Delta^B_{X \to x} S_\mu = 0 $$

But the change in $S_\mu$ when parallel-transported from x to X along B is minus the
change when parallel-transported from X to x along B:

$$ \Delta^B_{x \to X} S_\mu = -\Delta^B_{X \to x} S_\mu $$

and therefore

$$ \Delta^A_{x \to X} S_\mu = \Delta^B_{x \to X} S_\mu \tag{6.3.8} $$

That is, we get the same value of $S_\mu$ by parallel transportation from X to x,
irrespective of which curve we follow. (For instance, if two gyroscopes are placed
in different intersecting orbits about the earth, and have the same orientation
when they pass close to each other at $X^\mu$, then any difference in their orientations
when they next pass close to each other at $x^\mu$ will be a measure of some average
of the curvature produced by the earth’s gravitational field.)

It follows that, given $S_\mu$ at X, we may determine a field $S_\mu(x)$, defined
throughout the space-time region where $R^\sigma{}_{\mu\nu\rho}$ vanishes, by parallel transport from X to x;
Eq. (6.3.8) ensures that the $S_\mu(x)$ so defined will depend only on x, and not on the
path from X to x. For this field the derivative along any curve $x(\tau)$ is

$$ \frac{dS_\mu}{d\tau} = \frac{\partial S_\mu}{\partial x^\nu}\, \frac{dx^\nu(\tau)}{d\tau} $$

and since the direction of $dx^\nu(\tau)/d\tau$ is arbitrary, Eq. (6.3.1) becomes

$$ \frac{\partial S_\mu}{\partial x^\nu} = \Gamma^\lambda_{\mu\nu}\, S_\lambda \tag{6.3.9} $$

or, in other words,

$$ S_{\mu;\nu} = 0 \tag{6.3.10} $$

Hence, if the curvature tensor vanishes, we may always construct solutions of
Eq. (6.3.9), with any given value of $S_\mu(X)$, by parallel transport of $S_\mu$ from X to x.
Conversely, if there exists any covariant vector field with vanishing covariant
derivatives, then (6.3.1) will certainly be satisfied, and since parallel transport
cannot change a field when carried about any closed curve, we conclude from
(6.3.7) that

$$ R^\sigma{}_{\mu\nu\rho}\, S_\sigma = 0 \tag{6.3.11} $$

throughout the region where $S_\mu$ satisfies (6.3.10).
(These conclusions could have been obtained by using well-known results
from the theory of partial differential equations$^1$ instead of the method of parallel
displacement. In this approach Eq. (6.3.11) appears as the necessary and sufficient
condition that Eq. (6.3.9) can be solved by a power series expansion in $x^\mu - X^\mu$.)


4 Gravitation versus Curvilinear Coordinates 


Suppose that we are presented with a metric tensor $g_{\mu\nu}(x)$ that is not just a
constant. How can we tell if the space is really permeated by a gravitational field,
or if $g_{\mu\nu}$ merely represents the metric $\eta_{\alpha\beta}$ of special relativity written in curvilinear
coordinates? In other words, how can we tell whether there is a set of Minkowskian
coordinates $\xi^\alpha(x)$ that everywhere satisfy the conditions

$$ \eta^{\alpha\beta} = g^{\mu\nu}(x)\, \frac{\partial \xi^\alpha(x)}{\partial x^\mu}\, \frac{\partial \xi^\beta(x)}{\partial x^\nu} \tag{6.4.1} $$

Note that the equivalence principle only says that at every point X we can find
locally inertial coordinates $\xi^\alpha(x)$ that satisfy (6.4.1) in an infinitesimal neighborhood
of X; what we are asking now is whether we can find one set of coordinates $\xi^\alpha(x)$
that satisfy Eq. (6.4.1) everywhere. For example, given the metric coefficients

$$ g_{rr} = 1, \quad g_{\theta\theta} = r^2, \quad g_{\varphi\varphi} = r^2 \sin^2\theta, \quad g_{tt} = -1 \tag{6.4.2} $$

we know that there is a set of $\xi^\alpha$ satisfying (6.4.1), that is,

$$ \xi^1 = r \sin\theta \cos\varphi, \quad \xi^2 = r \sin\theta \sin\varphi, \quad \xi^3 = r \cos\theta, \quad \xi^4 = t \tag{6.4.3} $$

but how could we have told that (6.4.2) was really equivalent to the Minkowski
metric $\eta_{\alpha\beta}$, if we weren’t clever enough to have recognized it as simply $\eta_{\alpha\beta}$ in
spherical polar coordinates? Or, on the other hand, if we change $g_{rr}$ in (6.4.2) to
an arbitrary function of r, how can we tell that this really represents a gravitational
field, that is, how can we tell that Eqs. (6.4.1) now have no solution?
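
For the spherical-polar example the claim is easy to confirm by brute force. The Python sketch below (an illustration with hypothetical function names, not part of the original text) differentiates the coordinates (6.4.3) numerically and checks the condition in its equivalent covariant form, $g_{\mu\nu} = \eta_{\alpha\beta}\, (\partial\xi^\alpha/\partial x^\mu)(\partial\xi^\beta/\partial x^\nu)$, against the coefficients (6.4.2). The sample point and the finite-difference step are arbitrary choices.

```python
import math

def xi(x):
    """Minkowskian coordinates (6.4.3) as functions of x = (r, theta, phi, t)."""
    r, th, ph, t = x
    return [r * math.sin(th) * math.cos(ph),
            r * math.sin(th) * math.sin(ph),
            r * math.cos(th),
            t]

def jacobian(x, h=1e-6):
    """J[a][m] = d xi^a / d x^m, estimated by central differences."""
    J = [[0.0] * 4 for _ in range(4)]
    for m in range(4):
        xp, xm = list(x), list(x)
        xp[m] += h
        xm[m] -= h
        fp, fm = xi(xp), xi(xm)
        for a in range(4):
            J[a][m] = (fp[a] - fm[a]) / (2 * h)
    return J

# Diagonal of eta_{alpha beta}, ordered as (xi^1, xi^2, xi^3, time).
ETA = [1.0, 1.0, 1.0, -1.0]

def induced_metric(x):
    """g[m][n] = eta_{ab} (d xi^a/d x^m)(d xi^b/d x^n)."""
    J = jacobian(x)
    return [[sum(ETA[a] * J[a][m] * J[a][n] for a in range(4))
             for n in range(4)] for m in range(4)]

point = (2.0, 0.7, 1.3, 0.5)
g = induced_metric(point)
expected_diagonal = [1.0, point[0] ** 2,
                     (point[0] * math.sin(point[1])) ** 2, -1.0]
```

The off-diagonal entries of `g` vanish and the diagonal reproduces (6.4.2) to within the finite-difference error. Of course this only verifies a guess for $\xi^\alpha(x)$ that we already had in hand; it does not tell us how to decide the question when no such guess is available.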

The answer is contained in the following theorem: The necessary and sufficient
conditions for a metric $g_{\mu\nu}(x)$ to be equivalent to the Minkowski metric $\eta_{\alpha\beta}$ [in the
sense that there is a transformation $x \to \xi$ satisfying (6.4.1)] are, first, that the
curvature tensor calculated from $g_{\mu\nu}$ must everywhere vanish,

$$ R^\lambda{}_{\mu\nu\kappa} = 0 \tag{6.4.4} $$

and, second, that at some point X the matrix $g^{\mu\nu}(X)$ has three positive and one
negative eigenvalues.

The necessity of these two conditions is obvious. Suppose that we can find a
coordinate system $\xi^\alpha(x)$ satisfying (6.4.1). In this coordinate system the metric
is $\eta_{\alpha\beta}$, all components of the affine connection vanish, and hence the Riemann
tensor $R^\alpha{}_{\beta\gamma\delta}$ vanishes. But the vanishing of a tensor is an invariant statement,
so $R^\lambda{}_{\mu\nu\kappa}$ must have vanished in the original $x^\mu$ coordinate system. Also, we have
already noted in Section 3.6 that a “congruence” like Eq. (6.4.1) requires $\eta^{\alpha\beta}$
and $g^{\mu\nu}$ to have the same numbers of positive, negative, and zero eigenvalues
everywhere.

To prove the sufficiency of Eq. (6.4.4) for the existence of a system of “everywhere
inertial” coordinates $\xi^\alpha(x)$ satisfying (6.4.1), we shall actually construct
the $\xi^\alpha(x)$. First we note that at any point $X$ we may find a matrix $d^\alpha{}_\mu$ for which

    $\eta^{\alpha\beta} = g^{\mu\nu}(X)\, d^\alpha{}_\mu\, d^\beta{}_\nu$    (6.4.5)

(For, since $g^{\mu\nu}(X)$ is a symmetric matrix, we can find an orthogonal matrix $O^\alpha{}_\mu$
for which the matrix $O g O^T$ is diagonal, that is, for which

    $O^\alpha{}_\mu\, g^{\mu\nu}\, O^\beta{}_\nu = D^\alpha$ for $\alpha = \beta$, and $= 0$ for $\alpha \neq \beta$

We are assuming that three of the eigenvalues $D^\alpha$ are positive and one negative,
and can always label the rows of $O^\alpha{}_\mu$ so that it is $D^1$, $D^2$, and $D^3$ that are positive
and $D^0$ that is negative. Then to satisfy (6.4.5) we need only choose
$d^i{}_\mu = O^i{}_\mu / \sqrt{D^i}$ for $i = 1, 2, 3$, and $d^0{}_\mu = O^0{}_\mu / \sqrt{-D^0}$.) Next, we define
quantities $D^\alpha{}_\mu(x)$ by the differential equations

    $\frac{\partial D^\alpha{}_\mu}{\partial x^\nu} = \Gamma^\lambda_{\mu\nu}\, D^\alpha{}_\lambda$    (6.4.6)

with the initial condition

    $D^\alpha{}_\mu = d^\alpha{}_\mu$ at $x = X$    (6.4.7)

We showed in the last section that such equations can always be solved, providing
that $R^\lambda{}_{\mu\nu\kappa}$ vanishes. (The quantities $D^\alpha{}_\mu$ are to be thought of as four covariant
vectors $D^0{}_\mu$, $D^1{}_\mu$, $D^2{}_\mu$, $D^3{}_\mu$ rather than as a single tensor.) Since
$\partial D^\alpha{}_\mu / \partial x^\nu$ is symmetric in $\mu$ and $\nu$, we can write the vectors $D^\alpha{}_\mu$ as gradients
of scalars, which we define to be the locally inertial coordinates $\xi^\alpha(x)$:

    $\frac{\partial \xi^\alpha}{\partial x^\mu} = D^\alpha{}_\mu$    (6.4.8)

with initial values $\xi^\alpha(X)$ some arbitrary constants. To see that these $\xi$ coordinates
do satisfy (6.4.1), note first that

    $\frac{\partial}{\partial x^\rho}\left( g^{\mu\nu} D^\alpha{}_\mu D^\beta{}_\nu \right) = 0$    (6.4.9)

This can be verified by direct calculation or, more simply, by noting that (6.4.6)
just says that $D^\alpha{}_{\mu;\rho}$ vanishes, and $g^{\mu\nu}{}_{;\rho}$ also vanishes, so
$(g^{\mu\nu} D^\alpha{}_\mu D^\beta{}_\nu)_{;\rho}$ vanishes; but since $g^{\mu\nu} D^\alpha{}_\mu D^\beta{}_\nu$ is a scalar, this
means that its ordinary derivative vanishes. But (6.4.7) and (6.4.5) show that
$g^{\mu\nu} D^\alpha{}_\mu D^\beta{}_\nu$ equals $\eta^{\alpha\beta}$ at $x = X$, so since it is a constant this holds everywhere:

    $\eta^{\alpha\beta} = g^{\mu\nu} D^\alpha{}_\mu D^\beta{}_\nu$    (all $x$)    (6.4.10)

Equation (6.4.1) follows immediately from (6.4.8) and (6.4.10).
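
The spherical polar example that opened this discussion can be checked numerically. The sketch below assumes that (6.4.1) has the congruence form $\eta^{\alpha\beta} = g^{\mu\nu}\, \partial\xi^\alpha/\partial x^\mu\, \partial\xi^\beta/\partial x^\nu$ and that (6.4.2) is the usual spherical polar metric with inverse components $g^{rr} = 1$, $g^{\theta\theta} = 1/r^2$, $g^{\varphi\varphi} = 1/(r^2 \sin^2\theta)$, $g^{tt} = -1$; neither equation is reproduced above, so both forms are assumptions. The function names are illustrative only.

```python
import math

# Inverse Minkowski metric with the time coordinate last, matching (6.4.3): xi^4 = t.
ETA_INV = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, -1]]


def jacobian(r, theta, phi):
    """d xi^alpha / d x^mu for x = (r, theta, phi, t), differentiating (6.4.3)."""
    st, ct = math.sin(theta), math.cos(theta)
    sp, cp = math.sin(phi), math.cos(phi)
    return [
        [st * cp, r * ct * cp, -r * st * sp, 0.0],
        [st * sp, r * ct * sp, r * st * cp, 0.0],
        [ct, -r * st, 0.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]


def inverse_metric_diagonal(r, theta):
    """Diagonal of g^{mu nu} in spherical polar coordinates (assumed form of (6.4.2))."""
    return [1.0, 1.0 / r**2, 1.0 / (r * math.sin(theta)) ** 2, -1.0]


def congruence(r, theta, phi):
    """g^{mu nu} (d xi^a / d x^mu) (d xi^b / d x^nu), the assumed right side of (6.4.1)."""
    jac = jacobian(r, theta, phi)
    g_inv = inverse_metric_diagonal(r, theta)
    return [[sum(g_inv[m] * jac[a][m] * jac[b][m] for m in range(4))
             for b in range(4)] for a in range(4)]


def max_deviation_from_minkowski(r, theta, phi):
    """Largest absolute difference between the congruence and eta^{alpha beta}."""
    result = congruence(r, theta, phi)
    return max(abs(result[a][b] - ETA_INV[a][b])
               for a in range(4) for b in range(4))


if __name__ == "__main__":
    print(max_deviation_from_minkowski(2.0, 0.7, 1.3))
```

A deviation at the level of floating-point rounding at several sample points is what one would expect if the transformation (6.4.3) does satisfy the congruence; it is a spot check, not a proof.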


5 Commutation of Covariant Derivatives 

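There is another way to see that the tensor $R^\lambda{}_{\mu\nu\kappa}$ expresses the presence or
absence of a true gravitational field. Consider the second covariant derivative of a
covariant vector $V_\mu$:

    $V_{\mu;\nu;\kappa} = \frac{\partial V_{\mu;\nu}}{\partial x^\kappa} - \Gamma^\lambda_{\nu\kappa} V_{\mu;\lambda} - \Gamma^\lambda_{\mu\kappa} V_{\lambda;\nu}$

    $= \frac{\partial^2 V_\mu}{\partial x^\nu \partial x^\kappa} - \frac{\partial}{\partial x^\kappa}\left( \Gamma^\lambda_{\mu\nu} V_\lambda \right) - \Gamma^\lambda_{\nu\kappa} \left[ \frac{\partial V_\mu}{\partial x^\lambda} - \Gamma^\sigma_{\mu\lambda} V_\sigma \right] - \Gamma^\lambda_{\mu\kappa} \left[ \frac{\partial V_\lambda}{\partial x^\nu} - \Gamma^\sigma_{\lambda\nu} V_\sigma \right]$

The terms involving first and second derivatives of $V$ are symmetric in $\nu$ and $\kappa$,
but the terms involving $V$ itself contain an antisymmetric part,

    $V_{\mu;\nu;\kappa} - V_{\mu;\kappa;\nu} = -V_\sigma R^\sigma{}_{\mu\nu\kappa}$    (6.5.1)

In the same way, we could show that

    $V^\lambda{}_{;\nu;\kappa} - V^\lambda{}_{;\kappa;\nu} = V^\sigma R^\lambda{}_{\sigma\nu\kappa}$    (6.5.2)

Similar formulas hold for any tensor; for instance,

    $T^\lambda{}_{\mu;\nu;\kappa} - T^\lambda{}_{\mu;\kappa;\nu} = T^\sigma{}_\mu R^\lambda{}_{\sigma\nu\kappa} - T^\lambda{}_\sigma R^\sigma{}_{\mu\nu\kappa}$    (6.5.3)

Thus, if the curvature tensor vanishes, then covariant derivatives commute,
as would be expected for a coordinate system that can be transformed into a
Minkowski coordinate system.


6 Algebraic Properties of $R_{\lambda\mu\nu\kappa}$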

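The algebraic properties of the curvature tensor are greatly clarified if we
consider, instead of $R^\lambda{}_{\mu\nu\kappa}$, its fully covariant form

    $R_{\lambda\mu\nu\kappa} \equiv g_{\lambda\sigma} R^\sigma{}_{\mu\nu\kappa}$    (6.6.1)

Using (6.1.5) and (3.3.7), this is

    $R_{\lambda\mu\nu\kappa} = \frac12 g_{\lambda\sigma} \frac{\partial}{\partial x^\kappa} \left\{ g^{\sigma\rho} \left[ \frac{\partial g_{\rho\mu}}{\partial x^\nu} + \frac{\partial g_{\rho\nu}}{\partial x^\mu} - \frac{\partial g_{\mu\nu}}{\partial x^\rho} \right] \right\}$
    $\quad - \frac12 g_{\lambda\sigma} \frac{\partial}{\partial x^\nu} \left\{ g^{\sigma\rho} \left[ \frac{\partial g_{\rho\mu}}{\partial x^\kappa} + \frac{\partial g_{\rho\kappa}}{\partial x^\mu} - \frac{\partial g_{\mu\kappa}}{\partial x^\rho} \right] \right\}$
    $\quad + g_{\lambda\sigma} \left[ \Gamma^\eta_{\mu\nu} \Gamma^\sigma_{\kappa\eta} - \Gamma^\eta_{\mu\kappa} \Gamma^\sigma_{\nu\eta} \right]$

We use the relation

    $g_{\lambda\sigma} \frac{\partial g^{\sigma\rho}}{\partial x^\kappa} = -g^{\sigma\rho} \frac{\partial g_{\lambda\sigma}}{\partial x^\kappa} = -g^{\sigma\rho} \left( \Gamma^\eta_{\kappa\lambda} g_{\eta\sigma} + \Gamma^\eta_{\kappa\sigma} g_{\eta\lambda} \right)$

and so find

    $R_{\lambda\mu\nu\kappa} = \frac12 \left[ \frac{\partial^2 g_{\lambda\nu}}{\partial x^\kappa \partial x^\mu} - \frac{\partial^2 g_{\mu\nu}}{\partial x^\kappa \partial x^\lambda} - \frac{\partial^2 g_{\lambda\kappa}}{\partial x^\nu \partial x^\mu} + \frac{\partial^2 g_{\mu\kappa}}{\partial x^\nu \partial x^\lambda} \right]$
    $\quad - \left[ \Gamma^\eta_{\kappa\lambda} g_{\eta\sigma} + \Gamma^\eta_{\kappa\sigma} g_{\eta\lambda} \right] \Gamma^\sigma_{\mu\nu}$
    $\quad + \left[ \Gamma^\eta_{\nu\lambda} g_{\eta\sigma} + \Gamma^\eta_{\nu\sigma} g_{\eta\lambda} \right] \Gamma^\sigma_{\mu\kappa}$
    $\quad + g_{\lambda\sigma} \left[ \Gamma^\eta_{\mu\nu} \Gamma^\sigma_{\kappa\eta} - \Gamma^\eta_{\mu\kappa} \Gamma^\sigma_{\nu\eta} \right]$

Most of the $\Gamma\Gamma$ terms cancel, leaving us with

    $R_{\lambda\mu\nu\kappa} = \frac12 \left[ \frac{\partial^2 g_{\lambda\nu}}{\partial x^\kappa \partial x^\mu} - \frac{\partial^2 g_{\mu\nu}}{\partial x^\kappa \partial x^\lambda} - \frac{\partial^2 g_{\lambda\kappa}}{\partial x^\nu \partial x^\mu} + \frac{\partial^2 g_{\mu\kappa}}{\partial x^\nu \partial x^\lambda} \right] + g_{\eta\sigma} \left[ \Gamma^\eta_{\nu\lambda} \Gamma^\sigma_{\mu\kappa} - \Gamma^\eta_{\kappa\lambda} \Gamma^\sigma_{\mu\nu} \right]$    (6.6.2)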


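From (6.6.2) we may read off the algebraic properties of the curvature tensor:

(A) Symmetry:

    $R_{\lambda\mu\nu\kappa} = R_{\nu\kappa\lambda\mu}$    (6.6.3)

(B) Antisymmetry:

    $R_{\lambda\mu\nu\kappa} = -R_{\mu\lambda\nu\kappa} = -R_{\lambda\mu\kappa\nu} = +R_{\mu\lambda\kappa\nu}$    (6.6.4)

(C) Cyclicity:

    $R_{\lambda\mu\nu\kappa} + R_{\lambda\kappa\mu\nu} + R_{\lambda\nu\kappa\mu} = 0$    (6.6.5)

We have already mentioned that $R_{\lambda\mu\nu\kappa}$ may be contracted to give the Ricci
tensor

    $R_{\mu\kappa} = g^{\lambda\nu} R_{\lambda\mu\nu\kappa}$    (6.6.6)

The symmetry property (A) shows that the Ricci tensor is symmetric,

    $R_{\mu\kappa} = R_{\kappa\mu}$    (6.6.7)

and the antisymmetry property (B) tells us that $R_{\mu\kappa}$ is essentially the only
second-rank tensor that can be formed from $R_{\lambda\mu\nu\kappa}$, since multiplying (6.6.4) with
$g^{\lambda\nu}$, $g^{\lambda\mu}$, and $g^{\nu\kappa}$ gives

    $R_{\mu\kappa} = -g^{\lambda\nu} R_{\mu\lambda\nu\kappa} = -g^{\lambda\nu} R_{\lambda\mu\kappa\nu} = +g^{\lambda\nu} R_{\mu\lambda\kappa\nu}$

    $g^{\lambda\mu} R_{\lambda\mu\nu\kappa} = g^{\nu\kappa} R_{\lambda\mu\nu\kappa} = 0$

From the antisymmetry property (B) we also see that there is essentially only
one way of contracting $R_{\lambda\mu\nu\kappa}$ to construct a scalar:

    $R = g^{\lambda\nu} g^{\mu\kappa} R_{\lambda\mu\nu\kappa} = -g^{\lambda\nu} g^{\mu\kappa} R_{\mu\lambda\nu\kappa}$

    $0 = g^{\lambda\mu} g^{\nu\kappa} R_{\lambda\mu\nu\kappa}$

Finally, (C) eliminates the one other scalar that might have been formed in four
dimensions, that is,

    $\frac{1}{\sqrt{g}}\, \epsilon^{\lambda\mu\nu\kappa} R_{\lambda\mu\nu\kappa} = 0$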


7 Description of Curvature in N Dimensions* 

For the moment let us consider a general space of $N$ dimensions. To count
the number of algebraically independent components of $R_{\lambda\mu\nu\kappa}$, it is convenient to
adopt what may be called the Petrov notation, 2 and think of $R_{\lambda\mu\nu\kappa}$ as a matrix
$R_{(\lambda\mu)(\nu\kappa)}$ with “indices” $(\lambda\mu)$ and $(\nu\kappa)$. From (6.6.4) we see that each “index”
takes a number of independent values equal to the number of independent elements of an
antisymmetric matrix in $N$ dimensions, or $\frac12 N(N - 1)$. From (6.6.3) we see that
$R_{(\lambda\mu)(\nu\kappa)}$ is symmetric in these “indices,” so (6.6.3) and (6.6.4) alone would leave
$R_{\lambda\mu\nu\kappa}$ with a number of independent components equal to the number of
independent elements of a symmetric matrix in $\frac12 N(N - 1)$ dimensions, or

    $\frac12 \left[ \frac12 N(N - 1) \right] \left[ \frac12 N(N - 1) + 1 \right] = \frac18 N(N - 1)(N^2 - N + 2)$


* This section lies somewhat out of the book’s main line of development, and may be omitted in a first
reading.

Equations (6.6.3) and (6.6.4) also make the cyclic sum
$R_{\lambda\mu\nu\kappa} + R_{\lambda\kappa\mu\nu} + R_{\lambda\nu\kappa\mu}$
completely antisymmetric, so Eq. (6.6.5) adds $N(N - 1)(N - 2)(N - 3)/4!$
further constraints, leaving $R_{\lambda\mu\nu\kappa}$ with a number of independent components
equal to

    $C_N = \frac18 N(N - 1)(N^2 - N + 2) - \frac{1}{24} N(N - 1)(N - 2)(N - 3)$

or combining terms

    $C_N = \frac{1}{12} N^2 (N^2 - 1)$    (6.7.1)
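
The counting argument above is easy to check by direct arithmetic. The following is a minimal sketch (the function names are illustrative, not from the text) that evaluates the two-step count and compares it with the closed form (6.7.1) for small $N$.

```python
from fractions import Fraction


def pair_symmetric_count(n):
    """Components allowed by (6.6.3) and (6.6.4): a symmetric matrix in n(n-1)/2 dimensions."""
    m = Fraction(n * (n - 1), 2)
    return m * (m + 1) / 2


def cyclic_constraints(n):
    """Further constraints from (6.6.5): n(n-1)(n-2)(n-3)/4!."""
    return Fraction(n * (n - 1) * (n - 2) * (n - 3), 24)


def riemann_components(n):
    """C_N obtained by subtracting the cyclicity constraints."""
    return pair_symmetric_count(n) - cyclic_constraints(n)


def closed_form(n):
    """C_N from the combined expression (6.7.1)."""
    return Fraction(n**2 * (n**2 - 1), 12)


if __name__ == "__main__":
    for n in range(1, 7):
        print(n, riemann_components(n), closed_form(n))
```

For $N = 1, 2, 3, 4$ both expressions give 0, 1, 6, and 20, the values used in the remainder of this section.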

In one dimension the curvature tensor $R_{1111}$ always vanishes, as can be seen
from (6.6.4) or (6.6.5) or from the fact that (6.7.1) gives $C_1 = 0$ independent
components. It may strike the reader as odd that a curved line should have zero
curvature, but this just emphasizes that $R_{\lambda\mu\nu\kappa}$ reflects only the inner properties
of the space, not how it is embedded in a higher dimensional space. Indeed, we
note that the transformation rule for the metric tensor in one dimension is

    $g'_{11} = \left( \frac{dx}{dx'} \right)^2 g_{11}$

so that $g'_{11}$ can be made equal to $+1$ everywhere by choosing

    $x' = \int \sqrt{g_{11}(x)}\, dx$

In two dimensions (6.7.1) gives only one independent component, which
can be taken as $R_{1212}$; the other components are related to $R_{1212}$ by Eq. (6.6.4):

    $R_{1212} = -R_{2112} = -R_{1221} = R_{2121}$
    $R_{1111} = R_{1122} = R_{2211} = R_{2222} = 0$

These formulas can be summarized more elegantly by

    $R_{\lambda\mu\nu\kappa} = (g_{\lambda\nu} g_{\mu\kappa} - g_{\lambda\kappa} g_{\mu\nu}) \frac{R_{1212}}{g}$    (6.7.2)

where $g$ is the determinant $g_{11} g_{22} - g_{12}^2$. Contracting $\lambda$ with $\nu$ gives the Ricci
tensor

    $R_{\mu\kappa} = g_{\mu\kappa} \frac{R_{1212}}{g}$

and contracting $\mu$ and $\kappa$ gives the curvature scalar

    $R = \frac{2 R_{1212}}{g}$    (6.7.3)

so the curvature tensor is

    $R_{\lambda\mu\nu\kappa} = \frac12 R\, (g_{\lambda\nu} g_{\mu\kappa} - g_{\lambda\kappa} g_{\mu\nu})$    (6.7.4)

The Gaussian curvature $K$ discussed in the first section of this book is defined by

    $K = -\frac{R}{2} = -\frac{R_{1212}}{g}$    (6.7.5)

(The factor $-\frac12$ is of purely historical interest.) Equation (1.1.12) follows from
(6.6.2) and (6.7.5).
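
As a check on these sign conventions, the two-dimensional formulas can be applied to a sphere of radius $a$. The worked example below is a sketch under the assumption of the standard line element $ds^2 = a^2 d\theta^2 + a^2 \sin^2\theta\, d\varphi^2$; the symbol $a$, the choice of surface, and the quoted connection components are illustrative and are not taken from the text.

```latex
\begin{align*}
% Assumed metric: sphere of radius a, with (x^1, x^2) = (\theta, \varphi)
g_{11} &= a^2, \qquad g_{22} = a^2 \sin^2\theta, \qquad g = a^4 \sin^2\theta \\
\Gamma^{1}_{22} &= -\sin\theta\cos\theta, \qquad \Gamma^{2}_{12} = \cot\theta \\
% Eq. (6.6.2) with \lambda\mu\nu\kappa = 1212
R_{1212} &= \tfrac12 \frac{\partial^2 g_{22}}{\partial\theta^2}
            - g_{22}\left(\Gamma^{2}_{12}\right)^2
          = a^2(\cos^2\theta - \sin^2\theta) - a^2\cos^2\theta
          = -a^2 \sin^2\theta \\
% Eqs. (6.7.3) and (6.7.5)
R &= \frac{2R_{1212}}{g} = -\frac{2}{a^2}, \qquad
K = -\frac{R}{2} = \frac{1}{a^2}
\end{align*}
```

With these conventions the sphere has negative $R$ but positive Gaussian curvature $K = 1/a^2$, which is consistent with the sign in (6.7.5).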

In three dimensions (6.7.1) gives the curvature tensor $C_3 = 6$ independent
components. This is also the number of independent components of the Ricci
tensor $R_{\mu\kappa}$ in three dimensions, so we may anticipate that here $R_{\lambda\mu\nu\kappa}$ may be
expressed in terms of $R_{\mu\kappa}$ alone. By using the covariance, symmetry, and
contraction properties of $R_{\lambda\mu\nu\kappa}$, we may further guess that this relation is

    $R_{\lambda\mu\nu\kappa} = g_{\lambda\nu} R_{\mu\kappa} - g_{\lambda\kappa} R_{\mu\nu} - g_{\mu\nu} R_{\lambda\kappa} + g_{\mu\kappa} R_{\lambda\nu} - \frac12 (g_{\lambda\nu} g_{\mu\kappa} - g_{\lambda\kappa} g_{\mu\nu}) R$    (6.7.6)

To prove that (6.7.6) is correct, let us adopt a coordinate system such that $g_{\mu\nu}$
vanishes for $\mu \neq \nu$ at some point $X$. (This can be managed by choosing $\partial x'^\mu / \partial x^\lambda$
at $X$ as the orthogonal matrix that diagonalizes $g_{\mu\nu}$ at $X$.) In this system we have
at $X$

    $R_{12} = g^{33} R_{1323}$

so

    $R_{1323} = g_{33} R_{12}$

in agreement with (6.7.6). Furthermore,

    $R_{11} = g^{22} R_{1212} + g^{33} R_{1313}$
    $R_{22} = g^{33} R_{2323} + g^{11} R_{2121}$

so

    $g_{22} R_{11} + g_{11} R_{22} = 2 R_{1212} + g^{33} (g_{22} R_{1313} + g_{11} R_{2323})$
    $\quad = R_{1212} + g_{11} g_{22} (g^{11} g^{22} R_{1212} + g^{11} g^{33} R_{1313} + g^{22} g^{33} R_{2323})$

or

    $R_{1212} = g_{22} R_{11} + g_{11} R_{22} - \frac12 g_{11} g_{22} R$

again in agreement with (6.7.6). The other independent components of $R_{\lambda\mu\nu\kappa}$ are
$R_{1223}$, $R_{1213}$, $R_{2323}$, and $R_{3131}$, which can be obtained from $R_{1323}$ and $R_{1212}$ by
permuting the values 1, 2, 3; so (6.7.6) holds for these components as well. Since
(6.7.6) thus holds in a coordinate system that is orthogonal at $X$, and is manifestly
covariant, it holds in general.

It is only in four or more dimensions that the full Riemann-Christoffel tensor
$R_{\lambda\mu\nu\kappa}$ is needed to describe the curvature of a space. For instance, in four dimensions
(6.7.1) gives the curvature tensor $C_4 = 20$ independent components, whereas
$R_{\mu\kappa}$ has only 10 independent components, so $R_{\lambda\mu\nu\kappa}$ has 10 components beyond
those which can be expressed in terms of $R_{\mu\kappa}$.

The $\frac{1}{12} N^2 (N^2 - 1)$ components of $R_{\lambda\mu\nu\kappa}$ describe the curvature of a general
$N$-dimensional space, but they do not do so in an invariant manner, for the values
of these components depend not only on the intrinsic properties of the space but
also on the particular coordinate system chosen. The invariant characterization
of a curved space must be in terms of scalars constructed from $R_{\lambda\mu\nu\kappa}$ and $g_{\mu\nu}$.
Let us count how many such scalars there are. The $N^2$ quantities $\partial x'^\mu / \partial x^\nu$ can for
a general coordinate transformation $x \to x'$ be made anything we like at a given
point $X$. Hence the $\frac{1}{12} N^2 (N^2 - 1)$ independent components of $R_{\lambda\mu\nu\kappa}$ and the
$\frac12 N(N + 1)$ independent components of $g_{\mu\nu}$ at this point may by general coordinate
transformation be subjected to $N^2$ algebraic conditions; the number of scalars
that can be constructed from $R_{\lambda\mu\nu\kappa}$ and $g_{\mu\nu}$ is therefore

    $\frac{1}{12} N^2 (N^2 - 1) + \frac12 N(N + 1) - N^2 = \frac{1}{12} N(N - 1)(N - 2)(N + 3)$    (6.7.7)

The case $N = 2$ is an exception to this argument, because in two dimensions
there is a one-parameter subgroup of coordinate transformations that has no
effect on $g_{\mu\nu}$ and $R_{\lambda\mu\nu\kappa}$; the correct number of invariants here is not zero but
one, that is, the curvature scalar $R$ itself. This exception does not occur for higher
dimensional spaces, so (6.7.7) holds for $N \geq 3$. For $N = 3$, Eq. (6.7.7) tells us
that there are three curvature scalars, which can conveniently be chosen as the
three roots of the secular equation

    $\mathrm{Det}\,(R_{\mu\nu} - \lambda g_{\mu\nu}) = 0$

or equivalently as the three quantities

    $R, \qquad R_{\mu\nu} R^{\mu\nu}, \qquad \frac{\mathrm{Det}\, R}{\mathrm{Det}\, g}$

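For $N = 4$, Eq. (6.7.7) tells us that there are fourteen curvature scalars. To
enumerate them (and for other purposes as well) it is convenient to decompose $R_{\lambda\mu\nu\kappa}$
into terms that depend only on the Ricci tensor $R_{\mu\nu}$ plus a term $C_{\lambda\mu\nu\kappa}$ that has no
nontrivial contractions. In $N \geq 3$ dimensions this decomposition is

    $R_{\lambda\mu\nu\kappa} = \frac{1}{N - 2} (g_{\lambda\nu} R_{\mu\kappa} - g_{\lambda\kappa} R_{\mu\nu} - g_{\mu\nu} R_{\lambda\kappa} + g_{\mu\kappa} R_{\lambda\nu}) - \frac{R}{(N - 1)(N - 2)} (g_{\lambda\nu} g_{\mu\kappa} - g_{\lambda\kappa} g_{\mu\nu}) + C_{\lambda\mu\nu\kappa}$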


The tensor $C_{\lambda\mu\nu\kappa}$ is called the Weyl tensor 3 or the conformal tensor. (The latter
name is used because the necessary and sufficient condition for the existence of a
coordinate system in which $g_{\mu\nu}$ is proportional to a constant matrix throughout
the space, is that $C_{\lambda\mu\nu\kappa}$ vanish everywhere. 4 ) This tensor has the same algebraic
properties as $R_{\lambda\mu\nu\kappa}$, and in addition it satisfies the $\frac12 N(N + 1)$ conditions

    $C^\lambda{}_{\mu\lambda\kappa} = 0$

so the number of its linearly independent components is

    $\frac{1}{12} N^2 (N^2 - 1) - \frac12 N(N + 1) = \frac{1}{12} N(N + 1)(N + 2)(N - 3)$

[Equation (6.7.6) simply says that $C_{\lambda\mu\nu\kappa} = 0$ for $N = 3$.] Barring degeneracies,
the curvature invariants can be described as consisting of all the components of the
Weyl tensor for that unique choice of coordinate axes that makes $R_{\mu\nu}$ and $g_{\mu\nu}$
diagonal, the elements of $g_{\mu\nu}$ being $+1$’s, $-1$’s, and 0’s, plus the $N$ eigenvalues of
$R_{\mu\nu}$. However, this enumeration breaks down when some of the eigenvalues of
$R_{\mu\nu}$ are degenerate. A particularly interesting case is that for $R_{\mu\nu} = 0$, which we
shall see in the next chapter describes physical gravitational fields in empty space.
In this case the curvature invariants for $N = 4$ are the 10 vanishing components of
$R_{\mu\nu}$ (the vanishing of a tensor is an invariant statement) plus the four quantities

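    $R_{\lambda\mu\nu\kappa} R^{\lambda\mu\nu\kappa}, \qquad R_{\lambda\mu\nu\kappa} R^{\nu\kappa\rho\sigma} R_{\rho\sigma}{}^{\lambda\mu}$

    $\frac{R_{\lambda\mu\nu\kappa}\, \epsilon^{\lambda\mu\rho\sigma} R_{\rho\sigma}{}^{\nu\kappa}}{\sqrt{g}}, \qquad \frac{R_{\lambda\mu\nu\kappa} R^{\nu\kappa}{}_{\rho\sigma}\, \epsilon^{\rho\sigma\alpha\beta} R_{\alpha\beta}{}^{\lambda\mu}}{\sqrt{g}}$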

Petrov 2 has given an equivalent description of the four nonvanishing curvature 
invariants as roots of a secular equation, and has classified various algebraic types 
of Weyl tensor according to the degeneracies of these roots. 

Finally, it should be emphasized that (6.7.7) gives the number of algebraically 
independent curvature invariants. There are in general differential relations among 
these invariants, and the number of functionally independent curvature invariants 
is less than (6.7.7). 
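
The counts quoted in this section can be reproduced with a few lines of integer arithmetic. The sketch below uses illustrative function names (not from the text) and applies only for $N \geq 3$, since the text notes that $N = 2$ is an exception to (6.7.7).

```python
def riemann_components(n):
    """C_N from (6.7.1); n^2 (n^2 - 1) is always divisible by 12."""
    return n * n * (n * n - 1) // 12


def metric_components(n):
    """Independent components of a symmetric n x n matrix such as g_{mu nu}."""
    return n * (n + 1) // 2


def curvature_scalars(n):
    """Left-hand side of (6.7.7): components of R and g minus the n^2 conditions."""
    return riemann_components(n) + metric_components(n) - n * n


def weyl_components(n):
    """Riemann components minus the n(n+1)/2 trace conditions on the Weyl tensor."""
    return riemann_components(n) - metric_components(n)


if __name__ == "__main__":
    for n in (3, 4, 5):
        print(n, curvature_scalars(n), weyl_components(n))
```

For $N = 4$ this gives 14 scalars and 10 Weyl components, which matches the vacuum enumeration above: 10 vanishing components of $R_{\mu\nu}$ plus four further invariants.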


8 The Bianchi Identities 

The curvature tensor obeys important differential identities, in addition to the
algebraic identities discussed in Section 6. These can be most easily derived at a
given point by adopting a locally inertial coordinate system in which $\Gamma^\lambda_{\mu\nu}$ (but
not its derivatives) vanishes at $x$. Then at $x$, Eq. (6.6.1) gives

    $R_{\lambda\mu\nu\kappa;\eta} = \frac12 \frac{\partial}{\partial x^\eta} \left( \frac{\partial^2 g_{\lambda\nu}}{\partial x^\kappa \partial x^\mu} - \frac{\partial^2 g_{\mu\nu}}{\partial x^\kappa \partial x^\lambda} - \frac{\partial^2 g_{\lambda\kappa}}{\partial x^\nu \partial x^\mu} + \frac{\partial^2 g_{\mu\kappa}}{\partial x^\nu \partial x^\lambda} \right)$

all other terms being at least of first order in $\Gamma$. By permuting $\nu$, $\kappa$, and $\eta$ cyclically
we obtain the Bianchi identities

    $R_{\lambda\mu\nu\kappa;\eta} + R_{\lambda\mu\eta\nu;\kappa} + R_{\lambda\mu\kappa\eta;\nu} = 0$    (6.8.1)

These equations are manifestly generally covariant, so since they hold in locally
inertial systems they hold in general. (They can also, of course, be checked by
direct calculation.)

We shall be particularly concerned with the contracted form of (6.8.1).
Recalling that the covariant derivatives of $g^{\lambda\nu}$ vanish, we find on contraction of $\lambda$
with $\nu$ that

    $R_{\mu\kappa;\eta} - R_{\mu\eta;\kappa} + R^\nu{}_{\mu\kappa\eta;\nu} = 0$    (6.8.2)

Contracting again gives

    $R_{;\eta} - R^\mu{}_{\eta;\mu} - R^\nu{}_{\eta;\nu} = 0$

or

    $\left( R^\mu{}_\eta - \frac12 \delta^\mu{}_\eta R \right)_{;\mu} = 0$    (6.8.3)

An equivalent but more familiar form is

    $\left( R^{\mu\nu} - \frac12 g^{\mu\nu} R \right)_{;\mu} = 0$    (6.8.4)

9 The Geometric Analogy* 

We have seen in this chapter that the nonvanishing of the tensor $R_{\lambda\mu\nu\kappa}$ is the
true expression of the presence of a gravitational field. We also saw in Chapter 1
that Gauss was led to introduce the Gaussian curvature $K = -R/2$ as the true
measure of the departure of a two-dimensional geometry from that of Euclid,
and that Riemann subsequently introduced the curvature tensor $R_{\lambda\mu\nu\kappa}$ to generalize
the concept of curvature to three or more dimensions. It is therefore not surprising
that Einstein and his successors have regarded the effects of a gravitational field
as producing a change in the geometry of space and time. At one time it was even
hoped that the rest of physics could be brought into a geometric formulation, but
this hope has met with disappointment, and the geometric interpretation of the
theory of gravitation has dwindled to a mere analogy, which lingers in our language
in terms like “metric,” “affine connection,” and “curvature,” but is not otherwise
very useful. The important thing is to be able to make predictions about images
on the astronomers’ photographic plates, frequencies of spectral lines, and so on,
and it simply doesn’t matter whether we ascribe these predictions to the physical
effect of gravitational fields on the motion of planets and photons or to a curvature
of space and time. (The reader should be warned that these views are heterodox
and would meet with objections from many general relativists.)

Despite the preceding remarks, it is worth mentioning without proof just what
the tensor $R_{\lambda\mu\nu\kappa}$ has to do with the curvature of a Riemannian space. Given a
point $X$ in a space of an arbitrary number of dimensions, and given two vectors
$a^\mu$, $b^\mu$ defined at $X$, we may construct through $X$ a family of “geodesic curves”
$x^\mu = x^\mu(\tau; \alpha, \beta)$ defined by

    $\frac{d^2 x^\mu}{d\tau^2} + \Gamma^\mu_{\nu\lambda} \frac{dx^\nu}{d\tau} \frac{dx^\lambda}{d\tau} = 0, \qquad \left( \frac{dx^\mu}{d\tau} \right)_{x = X} = \alpha a^\mu + \beta b^\mu$

the numbers $\alpha$, $\beta$ being allowed to run over all real values. These curves fill out a
two-dimensional surface $S(a, b)$ through $X$, and the Gaussian curvature of this
surface at $X$ is 5

    $K(a, b) = \frac{R_{\lambda\mu\nu\kappa}\, a^\lambda b^\mu a^\nu b^\kappa}{(g_{\lambda\kappa} g_{\mu\nu} - g_{\lambda\nu} g_{\mu\kappa})\, a^\lambda b^\mu a^\nu b^\kappa}$    (6.9.1)

From Eq. (6.7.4) we see that in two dimensions $K(a, b)$ is independent of $a$ and $b$
and is just $-R/2$.


* This section lies somewhat out of the book’s main line of development, and may be omitted in a first
reading.


10 Geodesic Deviation*


The introduction of the curvature tensor was motivated here by the need to 
construct suitable field equations for the gravitational field. However, the curvature 
tensor is also useful in expressing the effects of gravitation on physical systems. 

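For instance, consider a pair of nearby freely falling particles that travel on
trajectories $x^\mu(\tau)$ and $x^\mu(\tau) + \delta x^\mu(\tau)$. The equations of motion are

    $0 = \frac{d^2 x^\mu}{d\tau^2} + \Gamma^\mu_{\nu\lambda}(x) \frac{dx^\nu}{d\tau} \frac{dx^\lambda}{d\tau}$

    $0 = \frac{d^2}{d\tau^2} [x^\mu + \delta x^\mu] + \Gamma^\mu_{\nu\lambda}(x + \delta x)\, \frac{d}{d\tau} [x^\nu + \delta x^\nu]\, \frac{d}{d\tau} [x^\lambda + \delta x^\lambda]$

Evaluating the difference between these equations to first order in $\delta x^\mu$ gives

    $0 = \frac{d^2 \delta x^\mu}{d\tau^2} + \frac{\partial \Gamma^\mu_{\nu\lambda}}{\partial x^\rho}\, \delta x^\rho \frac{dx^\nu}{d\tau} \frac{dx^\lambda}{d\tau} + 2 \Gamma^\mu_{\nu\lambda} \frac{dx^\nu}{d\tau} \frac{d\, \delta x^\lambda}{d\tau}$


* This section lies somewhat out of the book’s main line of development, and may be omitted in a first
reading.

or, in terms of covariant derivatives along the curve $x^\mu(\tau)$ (see Section 4.9),

    $\frac{D^2}{D\tau^2}\, \delta x^\lambda = R^\lambda{}_{\nu\mu\rho}\, \delta x^\mu \frac{dx^\nu}{d\tau} \frac{dx^\rho}{d\tau}$    (6.10.1)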
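
A hedged illustration of (6.10.1): the block below sketches its Newtonian limit. It assumes a weak static field with $g_{00} \simeq -(1 + 2\phi)$, slowly moving particles, a purely spatial separation ($\delta x^0 = 0$), and the linear part of (6.6.2) for the curvature, with covariant derivatives along the curve replaced by ordinary time derivatives at leading order. These approximations are assumptions made for the example rather than steps taken in the text.

```latex
\begin{align*}
% Slow particles in a weak static field (assumed limit)
\frac{dx^\nu}{d\tau} &\simeq (1, 0, 0, 0), \qquad
R_{i0j0} \simeq \frac12 \frac{\partial^2 g_{00}}{\partial x^i\, \partial x^j}
          \simeq -\frac{\partial^2 \phi}{\partial x^i\, \partial x^j} \\
% Eq. (6.10.1) with \lambda = i and \nu = \rho = 0
\frac{d^2\, \delta x^i}{dt^2} &\simeq R^i{}_{0j0}\, \delta x^j
          \simeq -\frac{\partial^2 \phi}{\partial x^i\, \partial x^j}\, \delta x^j
\end{align*}
```

Under these assumptions the right-hand side is the familiar Newtonian tidal acceleration, the difference of $-\nabla\phi$ between two neighboring points.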


Although a freely falling particle appears to be at rest in a coordinate frame falling
with the particle, a pair of nearby freely falling particles will exhibit a relative
motion that can reveal the presence of a gravitational field to an observer that
falls with them. This is of course not a violation of the Principle of Equivalence,
because the effect of the right-hand side of (6.10.1) becomes negligible when the
separation between particles is much less than the characteristic dimensions of
the field.


“Of the general theory of relativity you will be convinced, once you have
studied it. Therefore I am not going to defend it with a single word.”
Albert Einstein, in a postcard to A. Sommerfeld, February 8, 1916


7 EINSTEIN’S 
FIELD EQUATIONS 


Chapters 3 through 5 have provided us with one-half of a complete theory 
of gravitation, that is, with a mathematical description of gravitational fields that 
dictates their effects on arbitrary physical systems. In this chapter we move on to 
the second half of general relativity, that is, to the differential equations that 
determine the gravitational fields themselves. 


1 Derivation of the Field Equations 

The field equations for gravitation are inevitably going to be more complicated 
than those for electromagnetism. Maxwell’s equations are linear because the 
electromagnetic field does not itself carry charge, whereas gravitational fields do 
carry energy and momentum (see Section 5.3) and must therefore contribute to 
their own source. That is, the gravitational field equations will have to be nonlinear 
partial differential equations, the nonlinearity representing the effect of gravitation 
on itself. 

In dealing with these nonlinear effects we are guided once again by the Principle
of Equivalence. At any point $X$ in an arbitrarily strong gravitational field, we
can define a locally inertial coordinate system such that

    $g_{\alpha\beta}(X) = \eta_{\alpha\beta}$    (7.1.1)

    $\left( \frac{\partial g_{\alpha\beta}(x)}{\partial x^\gamma} \right)_{x = X} = 0$    (7.1.2)

Hence for $x$ near $X$, the metric tensor $g_{\alpha\beta}$ can differ from $\eta_{\alpha\beta}$ only by terms quadratic
in $x - X$. In this coordinate system the gravitational field is weak near $X$, and
we can hope to describe the field by linear partial differential equations. And once
we know what these weak-field equations are, we can find the general field equations
by reversing the coordinate transformation that made the field weak.

Unfortunately, we have very little empirical information about the weak- 
field equations. This is not for any fundamental reason, but rather because 
gravitational radiation is so weakly generated and absorbed by matter, that it has 
not yet certainly been detected. However, although forgivable, our ignorance does 
prevent us from proceeding as directly as we did in previous chapters, and some 
guesswork will be unavoidable. 

First let us recall that in a weak static field produced by a nonrelativistic
mass density $\rho$, the time-time component of the metric tensor is approximately
given by

    $g_{00} \simeq -(1 + 2\phi)$

[See Eq. (3.4.5).] Here $\phi$ is the Newtonian potential, determined by Poisson’s
equation

    $\nabla^2 \phi = 4\pi G \rho$

where $G$ is Newton’s constant, equal to $6.670 \times 10^{-8}$ in c.g.s. units. Furthermore,
the energy density $T_{00}$ for nonrelativistic matter is just equal to its mass density

    $T_{00} \simeq \rho$

Combining the above, we have then

    $\nabla^2 g_{00} = -8\pi G\, T_{00}$    (7.1.3)
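
The combination leading to (7.1.3) can be written out in one line. This is only a restatement of the three relations just quoted, with nothing further assumed.

```latex
\begin{align*}
\nabla^2 g_{00} &\simeq \nabla^2\left[ -(1 + 2\phi) \right] = -2\, \nabla^2 \phi \\
                &= -2\, (4\pi G \rho) = -8\pi G \rho \simeq -8\pi G\, T_{00}
\end{align*}
```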

This field equation is only supposed to hold for weak static fields generated by
nonrelativistic matter, and is not even Lorentz invariant as it stands. However,
(7.1.3) leads us to guess that the weak-field equations for a general distribution
of energy and momentum take the form

    $G_{\alpha\beta} = -8\pi G\, T_{\alpha\beta}$    (7.1.4)

where $G_{\alpha\beta}$ is a linear combination of the metric and its first and second derivatives.
It follows then from the Principle of Equivalence that the equations which govern
gravitational fields of arbitrary strength must take the form

    $G_{\mu\nu} = -8\pi G\, T_{\mu\nu}$    (7.1.5)

where $G_{\mu\nu}$ is a tensor which reduces to $G_{\alpha\beta}$ for weak fields.

In general, there will be a variety of tensors $G_{\mu\nu}$ that can be formed from the
metric tensor and its derivatives, and that reduce in the weak-field limit to a given
$G_{\alpha\beta}$. Let us imagine $G_{\mu\nu}$ to be expanded in a sum of products of derivatives of the
metric, and classify each term according to the total number $N$ of derivatives of
metric components. (For example, a term with $N = 3$ could be linear in third
derivatives of the metric, or a product of a first derivative with a second derivative,
or a product of three first derivatives.) The whole of $G_{\mu\nu}$ must have the
dimensions of a second derivative, so each term of type $N \neq 2$ appears multiplied
with a constant having the dimensions of length to the power $N - 2$; such terms
will become negligible for gravitational fields of sufficiently large or small space-time
scale if $N > 2$ or $N < 2$, respectively. In order to remove the ambiguity in
$G_{\mu\nu}$, we shall assume that the gravitational field equations are uniform in scale, so
that only terms with $N = 2$ are allowed.

Let us review what we know about the left-hand side of the field equation
(7.1.5):

(A) By definition, $G_{\mu\nu}$ is a tensor.

(B) By assumption, $G_{\mu\nu}$ consists only of terms with $N = 2$ derivatives of the
metric; that is, $G_{\mu\nu}$ contains only terms that are either linear in the second
derivatives or quadratic in the first derivatives of the metric.

(C) Since $T_{\mu\nu}$ is symmetric, so is $G_{\mu\nu}$.

(D) Since $T_{\mu\nu}$ is conserved (in the sense of covariant differentiation) so is
$G_{\mu\nu}$:

    $G^\mu{}_{\nu;\mu} = 0$    (7.1.6)

(E) For a weak stationary field produced by nonrelativistic matter the 00
component of (7.1.5) must reduce to (7.1.3), so in this limit

    $G_{00} \simeq \nabla^2 g_{00}$    (7.1.7)

These properties are all we will need to find $G_{\mu\nu}$.

We saw in Section 6.2 that the most general way of constructing a tensor
satisfying (A) and (B) is by contraction of the curvature tensor $R^\lambda{}_{\mu\nu\kappa}$. The
antisymmetry property of $R_{\lambda\mu\nu\kappa}$ discussed in Section 6.6 shows that there are only
two tensors that can be formed by contracting $R_{\lambda\mu\nu\kappa}$; that is, the Ricci tensor
$R_{\mu\kappa} = R^\lambda{}_{\mu\lambda\kappa}$ and the curvature scalar $R = R^\mu{}_\mu$. Hence (A) and (B) require
$G_{\mu\nu}$ to take the form

    $G_{\mu\nu} = C_1 R_{\mu\nu} + C_2\, g_{\mu\nu} R$    (7.1.8)

where $C_1$ and $C_2$ are constants. This is automatically symmetric [see Eq. (6.6.7)],
so (C) tells us nothing new. Using the Bianchi identity (6.8.3) gives the covariant
divergence of $G_{\mu\nu}$ as

    $G^\mu{}_{\nu;\mu} = \left( \frac{C_1}{2} + C_2 \right) R_{;\nu}$

so (D) allows two possibilities: either $C_2 = -C_1/2$, or $R_{;\nu}$ vanishes everywhere.
We can reject the second possibility, because (7.1.8) and (7.1.5) give

    $G^\mu{}_\mu = (C_1 + 4 C_2) R = -8\pi G\, T^\mu{}_\mu$

Thus if $R_{;\nu} = \partial R / \partial x^\nu$ vanishes, then so must $\partial T^\mu{}_\mu / \partial x^\nu$, and this is not the case in
the presence of inhomogeneous nonrelativistic matter. We conclude then that
$C_2 = -C_1/2$, so (7.1.8) becomes

    $G_{\mu\nu} = C_1 \left( R_{\mu\nu} - \frac12 g_{\mu\nu} R \right)$    (7.1.9)

Finally, we use the property (E) to fix the constant $C_1$. A nonrelativistic
system always has $|T_{ij}| \ll |T_{00}|$, so we are concerned here with a case where
$|G_{ij}| \ll |G_{00}|$, or using (7.1.9),

    $R_{ij} \simeq \frac12 g_{ij} R$

Furthermore, we deal here with a weak field, so $g_{\alpha\beta} \simeq \eta_{\alpha\beta}$. The curvature scalar
is therefore given by

    $R \simeq R_{kk} - R_{00} \simeq \frac32 R - R_{00}$

or

    $R \simeq 2 R_{00}$    (7.1.10)

Using (7.1.10) and (7.1.1) in (7.1.9), we find

    $G_{00} \simeq 2 C_1 R_{00}$    (7.1.11)


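To calculate $R_{00}$ for a weak field we may use the linear part of $R_{\lambda\mu\nu\kappa}$, given by
Eq. (6.6.2) as

    $R_{\lambda\mu\nu\kappa} = \frac12 \left[ \frac{\partial^2 g_{\lambda\nu}}{\partial x^\kappa \partial x^\mu} - \frac{\partial^2 g_{\mu\nu}}{\partial x^\kappa \partial x^\lambda} - \frac{\partial^2 g_{\lambda\kappa}}{\partial x^\nu \partial x^\mu} + \frac{\partial^2 g_{\mu\kappa}}{\partial x^\nu \partial x^\lambda} \right]$

When the field is static all time derivatives vanish, and the components we need
become

    $R_{0000} \simeq 0, \qquad R_{i0j0} \simeq \frac12 \frac{\partial^2 g_{00}}{\partial x^i \partial x^j}$

Hence (7.1.11) gives

    $G_{00} \simeq 2 C_1 (R_{i0i0} - R_{0000}) \simeq C_1 \nabla^2 g_{00}$

and comparing this with (7.1.7), we find that (E) is satisfied if and only if $C_1 = 1$.
Setting $C_1 = 1$ in (7.1.9) completes our calculation of $G_{\mu\nu}$:

    $G_{\mu\nu} = R_{\mu\nu} - \frac12 g_{\mu\nu} R$    (7.1.12)

With (7.1.5), this gives the Einstein field equations

    $R_{\mu\nu} - \frac12 g_{\mu\nu} R = -8\pi G\, T_{\mu\nu}$    (7.1.13)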

An alternative form is sometimes useful. Contracting (7.1.13) with $g^{\mu\nu}$ gives

    $R - 2R = -8\pi G\, T^\mu{}_\mu$

or

    $R = 8\pi G\, T^\mu{}_\mu$    (7.1.14)

and using this in (7.1.13), we have

    $R_{\mu\nu} = -8\pi G \left( T_{\mu\nu} - \frac12 g_{\mu\nu} T^\lambda{}_\lambda \right)$    (7.1.15)

Of course we can also go from (7.1.15) back to (7.1.14) and (7.1.13), so (7.1.13)
and (7.1.15) should be regarded as entirely equivalent forms of the Einstein field
equations.
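
The equivalence of (7.1.13) and (7.1.15) is purely algebraic at each point, so it can be spot-checked numerically. The sketch below is a minimal illustration, assuming locally inertial coordinates in which $g_{\mu\nu} = \eta_{\mu\nu}$ at the point considered; the sample tensor `T_SAMPLE` and all function names are hypothetical and serve only as test input.

```python
import math

G_NEWTON = 6.670e-8          # Newton's constant in c.g.s. units, as quoted in the text
ETA = (-1.0, 1.0, 1.0, 1.0)  # diagonal of the Minkowski metric, index order (0, 1, 2, 3)


def metric(m, n):
    """g_{mu nu} at the chosen point, assumed equal to eta_{mu nu}."""
    return ETA[m] if m == n else 0.0


def trace(t):
    """t^lambda_lambda = g^{mu nu} t_{mu nu}; the diagonal eta is its own inverse."""
    return sum(ETA[m] * t[m][m] for m in range(4))


def ricci_from_source(t):
    """R_{mu nu} computed from T_{mu nu} using Eq. (7.1.15)."""
    k = -8.0 * math.pi * G_NEWTON
    tr = trace(t)
    return [[k * (t[m][n] - 0.5 * metric(m, n) * tr) for n in range(4)]
            for m in range(4)]


def einstein_lhs(ricci):
    """R_{mu nu} - (1/2) g_{mu nu} R, the left-hand side of Eq. (7.1.13)."""
    r = trace(ricci)
    return [[ricci[m][n] - 0.5 * metric(m, n) * r for n in range(4)]
            for m in range(4)]


def max_relative_residual(t):
    """Largest |LHS + 8 pi G T_{mu nu}|, measured in units of 8 pi G."""
    k = 8.0 * math.pi * G_NEWTON
    lhs = einstein_lhs(ricci_from_source(t))
    return max(abs(lhs[m][n] + k * t[m][n]) / k
               for m in range(4) for n in range(4))


# Hypothetical symmetric source tensor, chosen only for illustration.
T_SAMPLE = [
    [2.0, 0.1, 0.0, 0.3],
    [0.1, 0.5, 0.2, 0.0],
    [0.0, 0.2, 0.4, 0.1],
    [0.3, 0.0, 0.1, 0.6],
]

if __name__ == "__main__":
    print(max_relative_residual(T_SAMPLE))
```

A residual at rounding level indicates that a Ricci tensor built from (7.1.15) reproduces (7.1.13), and the trace of that Ricci tensor reproduces (7.1.14).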

In a vacuum $T_{\mu\nu}$ vanishes, so from (7.1.15) we see that the Einstein field
equations in empty space are just

    $R_{\mu\nu} = 0$    (7.1.16)

In a space-time of two or three dimensions this would imply the vanishing of the
full curvature tensor $R_{\lambda\mu\nu\kappa}$, and the consequent absence of a gravitational field.
(See Section 6.4.) It is only in four or more dimensions that true gravitational
fields can exist in empty space.

We might be willing to relax assumption (B), and allow $G_{\mu\nu}$ to contain terms
with fewer than two derivatives of the metric. The freedom to use first derivatives
does not allow any new terms in $G_{\mu\nu}$ (see Section 6.1), but if we can use the metric
tensor itself, then one new term is possible, equal to $g_{\mu\nu}$ times a constant $\lambda$. The
field equations would then read

    $R_{\mu\nu} - \frac12 g_{\mu\nu} R - \lambda g_{\mu\nu} = -8\pi G\, T_{\mu\nu}$

The term $\lambda g_{\mu\nu}$ was originally introduced by Einstein 1 for cosmological reasons
(which have since disappeared); for this reason, $\lambda$ is called the cosmological constant.
This term satisfies the requirements (A), (C), and (D), but does not satisfy (E), so
$\lambda$ must be very small so as not to interfere with the successes of Newton’s theory
of gravitation. Except in Chapter 16, I am assuming throughout this book that
$\lambda = 0$.


2 Another Derivation* 

The derivation of Einstein’s equations in the last section made heavy use of
the assumption that the left-hand side $G_{\mu\nu}$ is a tensor depending solely on the
metric and its first and second derivatives. We might consider using a more general
tensor, which involves elements unrelated to the metric tensor or its derivatives,
such as

    $\left( \frac{\partial^3 \xi^\alpha_X(x)}{\partial x^\mu\, \partial x^\nu\, \partial x^\lambda} \right)_{x = X}$    (7.2.1)

where $\xi^\alpha_X(x)$ are the coordinates locally inertial at $X$. (Reference to the precise
definitions (3.3.2) and (3.3.3) of the metric and affine connection will show that
(7.2.1) is not related to their derivatives.) Such a tensor could be constructed by
writing

    $G_{\mu\nu}(X) = \left( \frac{\partial \xi^\alpha_X(x)}{\partial x^\mu} \frac{\partial \xi^\beta_X(x)}{\partial x^\nu} \right)_{x = X} G_{\alpha\beta}$    (7.2.2)

where $G_{\alpha\beta}$ is the most general possible linear combination of second derivatives
of the metric tensor in the $\xi^\alpha$ coordinate system allowed by Lorentz covariance
and symmetry, that is,

    $G_{\alpha\beta} = a_1 \Box^2 g_{\alpha\beta} + a_2 \left( \frac{\partial^2 g^\gamma{}_\alpha}{\partial \xi^\gamma \partial \xi^\beta} + \frac{\partial^2 g^\gamma{}_\beta}{\partial \xi^\gamma \partial \xi^\alpha} \right) + a_3\, \eta_{\alpha\beta} \frac{\partial^2 g^{\gamma\delta}}{\partial \xi^\gamma \partial \xi^\delta} + b_1 \frac{\partial^2 g^\gamma{}_\gamma}{\partial \xi^\alpha \partial \xi^\beta} + b_2\, \eta_{\alpha\beta} \Box^2 g^\gamma{}_\gamma$    (7.2.3)

with $a_1$, $a_2$, $a_3$, $b_1$, $b_2$ five arbitrary dimensionless constants. [We have dropped the
label $X$. Also, all indices are raised and lowered with the Minkowski tensors
$\eta^{\alpha\beta}$ and $\eta_{\alpha\beta}$, and $\Box^2$ is the d’Alembertian
$\Box^2 \equiv \eta^{\alpha\beta} (\partial / \partial \xi^\alpha)(\partial / \partial \xi^\beta)$.] For perfectly
general values of the five constants $a_1$, $a_2$, $a_3$, $b_1$, $b_2$ this $G_{\mu\nu}$ would indeed depend
on foreign elements such as (7.2.1). However, it is remarkable that by making
use of energy-momentum conservation, and the validity of Newton’s theory for
weak static fields produced by nonrelativistic matter, we can put such stringent
requirements on the constants $a_1, \ldots, b_2$ that the terms involving (7.2.1) drop
out, and we get Einstein’s theory.


* This section lies somewhat out of the book’s main line of development, and may be omitted in a first
reading.

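In a weak field the requirement of energy and momentum conservation yields
the ordinary conservation law $\partial T^\alpha{}_\beta / \partial \xi^\alpha = 0$, and therefore the assumed field
equations $G_{\alpha\beta} = -8\pi G\, T_{\alpha\beta}$ require that

    $0 = \frac{\partial}{\partial \xi^\alpha} G^\alpha{}_\beta = (a_1 + a_2)\, \Box^2 \frac{\partial g^\alpha{}_\beta}{\partial \xi^\alpha} + (a_2 + a_3) \frac{\partial^3 g^{\gamma\delta}}{\partial \xi^\beta\, \partial \xi^\gamma\, \partial \xi^\delta} + (b_1 + b_2)\, \Box^2 \frac{\partial g^\gamma{}_\gamma}{\partial \xi^\beta}$

Hence $a_1 + a_2$, $a_2 + a_3$, and $b_1 + b_2$ must all vanish, giving

    $G_{\alpha\beta} = a_1 \left( \Box^2 g_{\alpha\beta} - \frac{\partial^2 g^\gamma{}_\alpha}{\partial \xi^\gamma \partial \xi^\beta} - \frac{\partial^2 g^\gamma{}_\beta}{\partial \xi^\gamma \partial \xi^\alpha} + \eta_{\alpha\beta} \frac{\partial^2 g^{\gamma\delta}}{\partial \xi^\gamma \partial \xi^\delta} \right) + b_1 \left( \frac{\partial^2 g^\gamma{}_\gamma}{\partial \xi^\alpha \partial \xi^\beta} - \eta_{\alpha\beta} \Box^2 g^\gamma{}_\gamma \right)$    (7.2.4)

To determine $a_1$ and $b_1$ we pass to the Newtonian limit. For a static field, (7.2.4)
gives

    $G_{ii} + G_{00} = a_1 \nabla^2 (g_{ii} + g_{00}) - b_1 \nabla^2 (g_{ii} - g_{00})$

(Repeated Latin indices are summed over the values 1, 2, 3.) For a nonrelativistic
material system $|T_{ii}|$ is much less than $|T_{00}|$, so we obtain the field equation

    $(a_1 + b_1) \nabla^2 g_{00} + (a_1 - b_1) \nabla^2 g_{ii} = -8\pi G\, T_{00}$    (7.2.5)

We want the field equations in this limit to imply Newton’s law,

    $\nabla^2 g_{00} = -8\pi G\, T_{00}$

but (7.2.5) is the only one of the field equations to involve only $g_{00}$ and/or $g_{ii}$,
so we must require that $a_1 = b_1 = \frac12$. The left-hand side of the weak-field equations
is then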

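    $G_{\alpha\beta} = \frac12 \left( \Box^2 g_{\alpha\beta} - \frac{\partial^2 g^\gamma{}_\alpha}{\partial \xi^\gamma \partial \xi^\beta} - \frac{\partial^2 g^\gamma{}_\beta}{\partial \xi^\gamma \partial \xi^\alpha} + \eta_{\alpha\beta} \frac{\partial^2 g^{\gamma\delta}}{\partial \xi^\gamma \partial \xi^\delta} + \frac{\partial^2 g^\gamma{}_\gamma}{\partial \xi^\alpha \partial \xi^\beta} - \eta_{\alpha\beta} \Box^2 g^\gamma{}_\gamma \right)$    (7.2.6)

But Eq. (6.6.2) shows that for a weak field the Ricci tensor is

    $R_{\alpha\beta} = \frac12 \left( \Box^2 g_{\alpha\beta} - \frac{\partial^2 g^\gamma{}_\alpha}{\partial \xi^\gamma \partial \xi^\beta} - \frac{\partial^2 g^\gamma{}_\beta}{\partial \xi^\gamma \partial \xi^\alpha} + \frac{\partial^2 g^\gamma{}_\gamma}{\partial \xi^\alpha \partial \xi^\beta} \right)$

so (7.2.5) gives the field equation as

    $R_{\alpha\beta} - \frac12 \eta_{\alpha\beta} R^\gamma{}_\gamma = -8\pi G\, T_{\alpha\beta}$    (7.2.7)

The Principle of Equivalence then immediately yields the Einstein equations for
a general field,

    $R_{\mu\nu} - \frac12 g_{\mu\nu} R = -8\pi G\, T_{\mu\nu}$    (7.2.8)

for (7.2.8) is generally covariant and reduces in locally inertial coordinate systems
to (7.2.6). Thus, if we want a more general equation than Einstein’s, which reduces
in the weak-field limit to a second-order equation with (7.2.4) on the left-hand
side, then we must pay the price of allowing new elements such as (7.2.1) to enter,
and we must give up the possibility of deriving Newton’s theory as a limiting case.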


3 The Brans-Dicke Theory 

Long-range forces are known to be transmitted by the gravitational field
$g_{\mu\nu}$ and by the electromagnetic potential $A_\mu$. It is natural then to suspect that
other long-range forces may be produced by scalar fields. Such theories have been
suggested since before general relativity; this section describes the latest and
possibly the best motivated theory in which a scalar field shares the stage with
gravitation, that of Brans and Dicke. 2

The starting point for Brans and Dicke is the idea of Mach, that the phenomenon
of inertia ought to arise from accelerations with respect to the general mass
distribution of the universe. (See Section 1.3.) Thus the inertial masses of the
various elementary particles ought not to be fundamental constants, but should
rather represent the particles’ interaction with some cosmic field. But the absolute
scale of the elementary particle masses (as opposed to their ratios, which presumably
have nothing to do with cosmic fields) can be measured only by measuring
gravitational accelerations $Gm/r^2$, so an equivalent conclusion is that the
gravitational constant $G$ ought to be related to the average value of a scalar field $\phi$,
which is coupled to the mass density of the universe.

The simplest generally covariant field equation for such a scalar field would be 

$$\Box^2\phi = 4\pi\lambda\, T_M{}^{\mu}{}_{\mu} \tag{7.3.1}$$

where $\Box^2\phi \equiv \phi_{;\rho}{}^{;\rho}$ is now the invariant d’Alembertian, $\lambda$ is a coupling constant,
and $T_M{}^{\mu\nu}$ is the energy-momentum tensor of the matter (i.e., everything but gravitation
and the $\phi$-field) of the universe. We can make a rough estimate of the average
value of $\phi$ by computing the central potential of a gas sphere with the cosmic
mass density $\rho \sim 10^{-29}$ g cm$^{-3}$ and radius equal to the apparent radius of the
universe $R \sim 10^{28}$ cm. (See Chapter 14.) This gives an average value

$$\langle\phi\rangle \sim \lambda\rho R^2 \sim \lambda \times 10^{27}\ \text{g cm}^{-1} \tag{7.3.2}$$

Note that $10^{27}$ g cm$^{-1}$ is reasonably close to the constant $1/G = 1.35 \times 10^{28}$
g cm$^{-1}$; hence we normalize $\phi$ so that

$$\langle\phi\rangle \simeq \frac{1}{G} \tag{7.3.3}$$


and (7.3.2) then shows that $\lambda$ is a dimensionless number of order unity. These
considerations led Brans and Dicke to suggest that the correct field equations for
gravitation are obtained by replacing $G$ with $1/\phi$ and including an energy-momentum
tensor $T_\phi{}^{\mu\nu}$ for the $\phi$-field in the source of the gravitational field:

$$R_{\mu\nu} - \tfrac12 g_{\mu\nu}R = -\frac{8\pi}{\phi}\left[T_{M\mu\nu} + T_{\phi\mu\nu}\right] \tag{7.3.4}$$
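
The order-of-magnitude argument behind (7.3.2) and (7.3.3) is easy to reproduce numerically. The following is only an illustrative sketch: the CGS values of $G$ and $c$ are standard constants supplied here rather than taken from the text, the coupling is set to 1 as an assumption, and $1/G$ is converted to g cm$^{-1}$ by using $c^2/G$ (units with $c = 1$).

```python
# Rough check of <phi> ~ lambda * rho * R^2 against 1/G (units with c = 1).
# Assumptions (not from the text): the CGS constants below and lambda = 1.
G_CGS = 6.674e-8      # Newton's constant, cm^3 g^-1 s^-2
C_CGS = 2.998e10      # speed of light, cm s^-1

rho = 1e-29           # cosmic mass density quoted in the text, g cm^-3
radius = 1e28         # apparent radius of the universe quoted in the text, cm
lam = 1.0             # hypothetical coupling of order unity

phi_estimate = lam * rho * radius**2   # g cm^-1, as in (7.3.2)
inverse_G = C_CGS**2 / G_CGS           # 1/G expressed in g cm^-1

print(f"lambda*rho*R^2 = {phi_estimate:.2e} g/cm")
print(f"1/G            = {inverse_G:.2e} g/cm")
print(f"ratio          = {inverse_G / phi_estimate:.1f}")
```

The two numbers differ by roughly one order of magnitude, which is the sense in which the text calls them "reasonably close" for so crude an estimate.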

We do not, however, wish to give up the successes of the Principle of Equivalence,
such as the equality of gravitational and inertial mass, and the gravitational
time dilation. Brans and Dicke therefore require that it is only $g_{\mu\nu}$, and not $\phi$,
that enters in the equations of motion of particles and photons. Therefore the
equation describing the interchange of energy between matter and gravitation is
the same as in Einstein’s theory:

$$T_M{}^{\mu}{}_{\nu;\mu} \equiv \frac{\partial T_M{}^{\mu}{}_{\nu}}{\partial x^\mu} + \Gamma^\mu_{\mu\rho}\,T_M{}^{\rho}{}_{\nu} - \Gamma^\rho_{\mu\nu}\,T_M{}^{\mu}{}_{\rho} = 0 \tag{7.3.5}$$
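The Bianchi identities tell us that the left-hand side of Eq. (7.3.4) has vanishing
covariant divergence, so by multiplying (7.3.4) by $\phi$ and taking the covariant
divergence, we find

$$\left(R^\mu{}_\nu - \tfrac12\delta^\mu{}_\nu R\right)\phi_{;\mu} = -8\pi\, T_\phi{}^{\mu}{}_{\nu;\mu} \tag{7.3.6}$$

This requirement proves sufficient to determine $T_\phi{}^{\mu}{}_{\nu}$. The most general
symmetric tensor that can be built up from terms each of which involves two
derivatives of one or two $\phi$ fields, and $\phi$ itself, is

$$T_\phi{}^{\mu}{}_{\nu} = A(\phi)\,\phi^{;\mu}\phi_{;\nu} + B(\phi)\,\delta^\mu{}_\nu\,\phi_{;\rho}\phi^{;\rho} + C(\phi)\,\phi^{;\mu}{}_{;\nu} + \delta^\mu{}_\nu\,D(\phi)\,\Box^2\phi \tag{7.3.7}$$

A straightforward calculation gives

$$T_\phi{}^{\mu}{}_{\nu;\mu} = [A'(\phi) + B'(\phi)]\,\phi^{;\mu}\phi_{;\mu}\phi_{;\nu} + [A(\phi) + D'(\phi)]\,\phi_{;\nu}\Box^2\phi + [A(\phi) + 2B(\phi) + C'(\phi)]\,\phi^{;\mu}{}_{;\nu}\phi_{;\mu} + D(\phi)\,(\Box^2\phi)_{;\nu} + C(\phi)\,\Box^2(\phi_{;\nu}) \tag{7.3.8}$$

(A prime here means the derivative with respect to $\phi$.) The first term of Eq.
(7.3.6) is determined by Eq. (6.5.2) as

$$R^\mu{}_\nu\,\phi_{;\mu} = (\Box^2\phi)_{;\nu} - \Box^2(\phi_{;\nu}) \tag{7.3.9}$$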

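Also, by taking the trace of Eq. (7.3.4) and using (7.3.1), we find

$$R = \frac{8\pi}{\phi}\left[\frac{\Box^2\phi}{4\pi\lambda} + \big(A(\phi) + 4B(\phi)\big)\,\phi^{;\mu}\phi_{;\mu} + \big(C(\phi) + 4D(\phi)\big)\,\Box^2\phi\right]$$

so the left-hand side of (7.3.6) is

$$\left(R^\mu{}_\nu - \tfrac12\delta^\mu{}_\nu R\right)\phi_{;\mu} = (\Box^2\phi)_{;\nu} - \Box^2(\phi_{;\nu}) - \frac{4\pi}{\phi}\left[\left(\frac{1}{4\pi\lambda} + C(\phi) + 4D(\phi)\right)\phi_{;\nu}\Box^2\phi + \big(A(\phi) + 4B(\phi)\big)\,\phi^{;\mu}\phi_{;\mu}\phi_{;\nu}\right] \tag{7.3.10}$$

By comparing the coefficients of $(\Box^2\phi)_{;\nu}$, $\Box^2(\phi_{;\nu})$, $\phi_{;\nu}\Box^2\phi$, $\phi^{;\mu}\phi_{;\mu}\phi_{;\nu}$, and
$\phi^{;\mu}{}_{;\nu}\phi_{;\mu}$ in Eqs. (7.3.8) and (7.3.10), we find that Eq. (7.3.6) requires

$$1 = -8\pi D(\phi)$$

$$-1 = -8\pi C(\phi)$$

$$-\frac{4\pi}{\phi}\left(\frac{1}{4\pi\lambda} + C(\phi) + 4D(\phi)\right) = -8\pi\big(A(\phi) + D'(\phi)\big)$$

$$-\frac{4\pi}{\phi}\big(A(\phi) + 4B(\phi)\big) = -8\pi\big(A'(\phi) + B'(\phi)\big)$$

$$0 = A(\phi) + 2B(\phi) + C'(\phi)$$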
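The unique solution is

$$A(\phi) = \frac{\omega}{8\pi\phi} \qquad B(\phi) = -\frac{\omega}{16\pi\phi} \qquad C(\phi) = \frac{1}{8\pi} \qquad D(\phi) = -\frac{1}{8\pi} \tag{7.3.11}$$

where $\omega$ is a convenient dimensionless constant given by

$$\omega = \frac{1}{\lambda} - \frac{3}{2} \qquad\text{or}\qquad \lambda = \frac{2}{3 + 2\omega} \tag{7.3.12}$$

The field equations (7.3.1) and (7.3.4) of the Brans-Dicke theory now read

$$\Box^2\phi = \frac{8\pi}{3 + 2\omega}\,T_M{}^{\mu}{}_{\mu} \tag{7.3.13}$$

$$R_{\mu\nu} - \tfrac12 g_{\mu\nu}R = -\frac{8\pi}{\phi}\,T_{M\mu\nu} - \frac{\omega}{\phi^2}\left(\phi_{;\mu}\phi_{;\nu} - \tfrac12 g_{\mu\nu}\,\phi_{;\rho}\phi^{;\rho}\right) - \frac{1}{\phi}\left(\phi_{;\mu;\nu} - g_{\mu\nu}\Box^2\phi\right) \tag{7.3.14}$$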
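
The solution (7.3.11) can be checked against the five coefficient conditions with exact rational arithmetic. The sketch below is an illustration under stated assumptions: every coefficient is stored multiplied by $8\pi$ (so that no value of $\pi$ is needed), the relation $\lambda = 2/(3 + 2\omega)$ of (7.3.12) is used, and the function name and sample values are hypothetical.

```python
from fractions import Fraction

def check_brans_dicke_coefficients(omega, phi):
    """Check the five conditions following (7.3.10) for the solution (7.3.11).

    Coefficients are stored multiplied by 8*pi, i.e.
    a = omega/phi, b = -omega/(2*phi), c = 1, d = -1.
    """
    omega, phi = Fraction(omega), Fraction(phi)
    lam = Fraction(2) / (3 + 2 * omega)          # (7.3.12)
    a, b = omega / phi, -omega / (2 * phi)
    c, d = Fraction(1), Fraction(-1)
    # Derivatives with respect to phi: a and b are proportional to 1/phi,
    # while c and d are constants.
    da, db, dc, dd = -a / phi, -b / phi, Fraction(0), Fraction(0)
    return [
        1 == -d,                                      # 1 = -8 pi D
        -1 == -c,                                     # -1 = -8 pi C
        (2 / lam + c + 4 * d) / (2 * phi) == a + dd,  # third condition, rescaled
        (a + 4 * b) / (2 * phi) == da + db,           # fourth condition, rescaled
        0 == a + 2 * b + dc,                          # 0 = A + 2B + C'
    ]

results = check_brans_dicke_coefficients(Fraction(5, 2), Fraction(7, 3))
print(all(results))
```

Because the check holds for arbitrary rational $\omega$ and $\phi$, it tests the functional form of $A$ and $B$ (both proportional to $1/\phi$) and not just one numerical coincidence.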


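Our previous estimate indicated that $\lambda$ is of order unity, so we expect that $\omega$ is of
order unity. If $\omega$ is much larger than unity, then (7.3.13) gives $\Box^2\phi = O(1/\omega)$,
and therefore

$$\phi = \langle\phi\rangle + O\!\left(\frac{1}{\omega}\right) = \frac{1}{G} + O\!\left(\frac{1}{\omega}\right) \tag{7.3.15}$$

Using this in (7.3.14) gives then

$$R_{\mu\nu} - \tfrac12 g_{\mu\nu}R = -8\pi G\,T_{M\mu\nu} + O\!\left(\frac{1}{\omega}\right)$$

Thus the Brans-Dicke theory goes over to the Einstein theory in the limit $\omega \to \infty$.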

It must be stressed that the role of the scalar field in the Brans-Dicke theory
is confined to its effect on the gravitational field equations. Once $g_{\mu\nu}$ is calculated,
the effects of gravitation on arbitrary physical systems are to be determined
exactly as described in Chapters 3 through 5.

Throughout most of this book it will be assumed that there is no scalar field
$\phi$ that contributes to long-range interactions. However, from time to time we
return to the Brans-Dicke theory in order to see what changes it would make in the
predictions of general relativity.
4 Coordinate Conditions


The symmetric tensor $G_{\mu\nu}$ has 10 independent components, so Einstein’s
field equations (7.1.13) comprise 10 algebraically independent equations. The
unknown metric tensor also has 10 algebraically independent components, and at
first sight one would think that the Einstein equations (with appropriate boundary
conditions) would suffice to determine the $g_{\mu\nu}$ uniquely. However, this is not so.
Although algebraically independent, the 10 $G_{\mu\nu}$ are related by four differential
identities, the Bianchi identities [see Eq. (6.8.3)]:

$$G^\mu{}_{\nu;\mu} = 0$$

Thus there are not 10 functionally independent equations, but only 10 - 4 = 6,
leaving us with four degrees of freedom in the 10 unknowns $g_{\mu\nu}$. These degrees of
freedom correspond to the fact that if $g_{\mu\nu}$ is a solution of Einstein’s equation, then
so is $g'_{\mu\nu}$, where $g'_{\mu\nu}$ is determined from $g_{\mu\nu}$ by a general coordinate transformation
$x \to x'$. Such a coordinate transformation involves four arbitrary functions $x'^\mu(x)$,
giving to the solutions of (7.1.13) just four degrees of freedom.

The failure of Einstein’s equations to determine $g_{\mu\nu}$ uniquely is closely
analogous with the failure of Maxwell’s equations to determine the vector potential
uniquely. When written in terms of the vector potential, Maxwell’s equations
read

$$\Box^2 A_\alpha - \frac{\partial^2 A^\beta}{\partial x^\alpha\,\partial x^\beta} = -J_\alpha \tag{7.4.1}$$

[See Eq. (2.7.6) and (2.7.11).] There are four equations for the four unknowns,
but they do not determine $A_\alpha$ uniquely, because the left-hand sides of these
equations are related by a differential identity analogous to the Bianchi identities:

$$\frac{\partial}{\partial x_\alpha}\left[\Box^2 A_\alpha - \frac{\partial^2 A^\beta}{\partial x^\alpha\,\partial x^\beta}\right] \equiv 0$$

Thus the number of functionally independent equations is really only 4 - 1 = 3,
and there is one degree of freedom in the solution for the four $A_\alpha$. This degree of
freedom of course corresponds to gauge invariance; given any solution $A_\alpha$, we
can find another solution $A'_\alpha = A_\alpha + \partial\Lambda/\partial x^\alpha$, with $\Lambda$ arbitrary.

The ambiguity in the solutions of Maxwell’s and Einstein’s equations can be
removed by main force. In the case of Maxwell’s equations we do this by choosing
a particular gauge. For instance, given any solution $A_\alpha$ we can always construct
a solution $A'_\alpha$ such that

$$\partial_\alpha A'^\alpha = 0 \tag{7.4.2}$$

by setting

$$A'_\alpha = A_\alpha + \frac{\partial\Phi}{\partial x^\alpha}$$

where $\Phi$ is defined by

$$\Box^2\Phi = -\frac{\partial A^\alpha}{\partial x^\alpha}$$

Such a solution is said to be in the Lorentz gauge. The condition (7.4.2) when added
to the three independent equations (7.4.1) completes a system of four equations
that, with appropriate boundary conditions, will generally determine the four
$A_\alpha$ uniquely. In the same way, we can eliminate the ambiguity in the metric
tensor by adopting some particular coordinate system. The choice of a coordinate
system can be expressed in four coordinate conditions, which, when added to the
six independent Einstein equations, determine an unambiguous solution.

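One particularly convenient choice of a coordinate system is represented by
the harmonic coordinate conditions

$$\Gamma^\lambda \equiv g^{\mu\nu}\Gamma^\lambda_{\mu\nu} = 0 \tag{7.4.3}$$

To see that it is always possible to choose a coordinate system in which this holds,
we recall the transformation equations of the affine connection

$$\Gamma'^\lambda_{\mu\nu} = \frac{\partial x'^\lambda}{\partial x^\rho}\frac{\partial x^\tau}{\partial x'^\mu}\frac{\partial x^\sigma}{\partial x'^\nu}\,\Gamma^\rho_{\tau\sigma} - \frac{\partial x^\rho}{\partial x'^\nu}\frac{\partial x^\sigma}{\partial x'^\mu}\frac{\partial^2 x'^\lambda}{\partial x^\rho\,\partial x^\sigma}$$

[See Eq. (4.5.8).] Contracting this with $g'^{\mu\nu}$, we find

$$\Gamma'^\lambda = \frac{\partial x'^\lambda}{\partial x^\rho}\,\Gamma^\rho - g^{\rho\sigma}\frac{\partial^2 x'^\lambda}{\partial x^\rho\,\partial x^\sigma} \tag{7.4.4}$$

Hence if $\Gamma^\rho$ does not vanish, we can always define a new coordinate system $x'^\lambda$
by solving the second-order partial differential equations

$$g^{\rho\sigma}\frac{\partial^2 x'^\lambda}{\partial x^\rho\,\partial x^\sigma} = \frac{\partial x'^\lambda}{\partial x^\rho}\,\Gamma^\rho$$

and Eq. (7.4.4) then gives $\Gamma'^\lambda = 0$ in the $x'$-system.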

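The four conditions (7.4.3) are of course not generally covariant, since their
purpose is to remove the ambiguity in the metric tensor owing to the general
covariance of the Einstein equations. Although we cannot write them as covariant
equations, they can be put in a somewhat more elegant form by expressing the
affine connection in terms of the metric tensor:

$$\Gamma^\lambda = \tfrac12 g^{\mu\nu}g^{\lambda\kappa}\left\{\frac{\partial g_{\kappa\mu}}{\partial x^\nu} + \frac{\partial g_{\kappa\nu}}{\partial x^\mu} - \frac{\partial g_{\mu\nu}}{\partial x^\kappa}\right\}$$

We recall that

$$g^{\lambda\kappa}\frac{\partial g_{\kappa\mu}}{\partial x^\nu} = -g_{\kappa\mu}\frac{\partial g^{\lambda\kappa}}{\partial x^\nu} \qquad \tfrac12 g^{\mu\nu}\frac{\partial g_{\mu\nu}}{\partial x^\kappa} = \frac{1}{\sqrt{g}}\frac{\partial\sqrt{g}}{\partial x^\kappa}$$

[See Eq. (4.7.5).] This then gives

$$\Gamma^\lambda = -g^{-1/2}\frac{\partial}{\partial x^\kappa}\left(g^{1/2}g^{\lambda\kappa}\right) \tag{7.4.5}$$

and the harmonic coordinate conditions read

$$\frac{\partial}{\partial x^\kappa}\left(\sqrt{g}\,g^{\lambda\kappa}\right) = 0 \tag{7.4.6}$$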


We are now in a position to explain the term “harmonic coordinates.” A
function $\phi$ is said to be harmonic if $\Box^2\phi$ vanishes, where $\Box^2$ is the invariant
d’Alembertian, defined by

$$\Box^2\phi \equiv \left(g^{\lambda\kappa}\phi_{;\lambda}\right)_{;\kappa} \tag{7.4.7}$$

Using (4.7.1), (4.7.7), and (7.4.5), this is

$$\Box^2\phi = g^{\lambda\kappa}\frac{\partial^2\phi}{\partial x^\lambda\,\partial x^\kappa} - \Gamma^\lambda\frac{\partial\phi}{\partial x^\lambda} \tag{7.4.8}$$

If $\Gamma^\lambda = 0$ then the coordinates are themselves harmonic functions,

$$\Box^2 x^\mu = 0 \tag{7.4.9}$$

thus justifying our application of the adjective “harmonic” to such coordinate
systems.

In the absence of gravitational fields, the obvious harmonic coordinate system
is that of Minkowski, in which $g^{\lambda\kappa} = \eta^{\lambda\kappa}$ and $g = 1$, so that (7.4.6) is satisfied
trivially. In the presence of weak gravitational fields the harmonic coordinate
systems may be pictured as nearly Minkowskian. Another related advantage of the
harmonic coordinate condition is that, as shown in Chapters 9 and 10, its use
produces a very great simplification in the weak-field equations, similar to the
simplification brought to Maxwell’s equations by use of the Lorentz gauge.
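
As a concrete check of (7.4.5), consider flat space written in spherical polar coordinates $(r, \theta, \varphi, t)$. This example is chosen here for illustration and is not taken from the text. In these coordinates $\sqrt{g} = r^2\sin\theta$ and $g^{rr} = 1$, so (7.4.5) gives $\Gamma^r = -2/r$, which does not vanish: polar coordinates are not harmonic even in the absence of gravitation. The sketch below compares a finite-difference evaluation of (7.4.5) with the direct contraction $g^{\mu\nu}\Gamma^r_{\mu\nu}$; the function names and step size are assumptions.

```python
import math

def gamma_r_from_7_4_5(r, theta, eps=1e-5):
    """Gamma^r = -(1/sqrt g) d/dr (sqrt g * g^rr), flat space, polar coordinates."""
    def sqrt_g(rr):
        return rr**2 * math.sin(theta)   # sqrt(g) with g = r^4 sin^2(theta)
    g_rr_upper = 1.0                      # g^{rr} = 1 in flat space
    # Central difference for d/dr of sqrt(g) * g^{rr}.
    derivative = (sqrt_g(r + eps) - sqrt_g(r - eps)) * g_rr_upper / (2.0 * eps)
    return -derivative / sqrt_g(r)

def gamma_r_direct(r, theta):
    """Gamma^r = g^{mu nu} Gamma^r_{mu nu}, using the flat-space values
    Gamma^r_{theta theta} = -r and Gamma^r_{phi phi} = -r sin^2(theta)."""
    g_thth_upper = 1.0 / r**2
    g_phph_upper = 1.0 / (r**2 * math.sin(theta)**2)
    return g_thth_upper * (-r) + g_phph_upper * (-r * math.sin(theta)**2)

print(gamma_r_from_7_4_5(2.0, 1.0), gamma_r_direct(2.0, 1.0))
```

Both evaluations agree with $-2/r$, so the compact form (7.4.5) reproduces the contraction of the affine connection in this simple case.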


5 The Cauchy Problem 

We can gain further insight into the mathematical content of Einstein’s
equations by applying them to the traditional initial value problem of Cauchy.
Suppose that we are given $g_{\mu\nu}$ and $\partial g_{\mu\nu}/\partial x^0$ everywhere on the “plane” $x^0 = t$.
If we could extract from the field equations a formula for $\partial^2 g_{\mu\nu}/\partial(x^0)^2$ everywhere
at $x^0 = t$, we could then compute $g_{\mu\nu}$ and $\partial g_{\mu\nu}/\partial x^0$ at a time $x^0 = t + \delta t$, and
by continuing this process $g_{\mu\nu}$ could be computed for all $x^i$ and $x^0$.

At first sight this looks feasible, because we need 10 second derivatives, and
there are 10 field equations. But let us look more closely at the left-hand side
$G^{\mu\nu} \equiv R^{\mu\nu} - \tfrac12 g^{\mu\nu}R$ of the field equations. The Bianchi identities (6.8.4) tell us
that

$$\frac{\partial}{\partial x^0}G^{\mu 0} \equiv -\frac{\partial}{\partial x^i}G^{\mu i} - \Gamma^\mu_{\nu\lambda}G^{\lambda\nu} - \Gamma^\nu_{\nu\lambda}G^{\mu\lambda}$$

The right-hand side contains no time derivatives higher than $\partial^2/\partial(x^0)^2$, so neither
does the left-hand side, and therefore $G^{\mu 0}$ contains no time derivatives higher
than $\partial/\partial x^0$. Thus we cannot learn anything about the time evolution of the
gravitational field from the four equations

$$G^{\mu 0} = -8\pi G T^{\mu 0} \tag{7.5.1}$$

Rather, these equations must be imposed as constraints on the initial data, that is,
on $g_{\mu\nu}$ and $\partial g_{\mu\nu}/\partial x^0$ at $x^0 = t$.

This leaves as “dynamical” equations only the other six Einstein equations

$$G^{ij} = -8\pi G T^{ij} \tag{7.5.2}$$

When we solve these equations for the 10 second derivatives $\partial^2 g_{\mu\nu}/\partial(x^0)^2$, we
must encounter a fourfold ambiguity, which of course we could not have hoped to
escape since it is always possible to make coordinate transformations that leave
$g_{\mu\nu}$ and $\partial g_{\mu\nu}/\partial x^0$ unchanged at $x^0 = t$ but that do alter $g_{\mu\nu}$ everywhere else. To be
more specific, what we find is that (7.5.2) determines the six $\partial^2 g^{ij}/\partial(x^0)^2$, but leaves
the other four derivatives $\partial^2 g^{\mu 0}/\partial(x^0)^2$ indeterminate. This ambiguity can be
removed by imposing four coordinate conditions that fix the coordinate system.
For instance, if we adopt the harmonic coordinate condition discussed in the last
section, the second time derivative of $\sqrt{g}\,g^{\mu 0}$ can be determined by differentiating
(7.4.6) with respect to time:

$$\frac{\partial^2}{\partial(x^0)^2}\left(\sqrt{g}\,g^{\mu 0}\right) = -\frac{\partial^2}{\partial x^0\,\partial x^i}\left(\sqrt{g}\,g^{\mu i}\right) \tag{7.5.3}$$

and the 10 equations (7.5.2) and (7.5.3) suffice to determine the second time
derivatives of all $g_{\mu\nu}$.

When the initial value problem is solved in this way, the constraints (7.5.1)
on the initial data need only be imposed once. The Bianchi identities and the
conservation of energy and momentum tell us that whether or not the Einstein
field equations are satisfied, we must have

$$\left(G^{\mu\nu} + 8\pi G T^{\mu\nu}\right)_{;\nu} = 0$$

Let us apply this at $x^0 = t$. Imposing the initial data constraints (7.5.1), and
determining the second derivatives from (7.5.2), the quantity in brackets will
vanish everywhere at $x^0 = t$, so this gives

$$\frac{\partial}{\partial x^0}\left(G^{\mu 0} + 8\pi G T^{\mu 0}\right) = 0 \quad\text{at } x^0 = t$$

and the fields computed at $x^0 = t + \delta t$ will therefore automatically also satisfy
the constraints (7.5.1). Thus this method of solving the initial value problem is one
that can be programmed for an automatic computer, once we find an initial metric
at $x^0 = t$ that satisfies the constraints (7.5.1).
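
The marching procedure described in this section can be illustrated on a much simpler system. The sketch below is a toy analogue and not Einstein’s equations: it evolves the one-dimensional wave equation $\partial^2 u/\partial t^2 = \partial^2 u/\partial x^2$ from initial data $(u, \partial u/\partial t)$ on the slice $t = 0$, using the field equation to supply the second time derivative at each step. The grid size, time step, periodic boundary condition, and function names are all illustrative assumptions.

```python
import math

def second_time_derivative(u, dx):
    """Toy field equation u_tt = u_xx, with a periodic centered difference."""
    n = len(u)
    return [(u[(i + 1) % n] - 2.0 * u[i] + u[i - 1]) / dx**2 for i in range(n)]

def evolve(u, v, dx, dt, steps):
    """Advance the initial data (u, v = du/dt) through `steps` time steps."""
    u, v = list(u), list(v)
    a = second_time_derivative(u, dx)
    for _ in range(steps):
        # Taylor step for the field, then a centered update for its time derivative.
        u = [u[i] + dt * v[i] + 0.5 * dt**2 * a[i] for i in range(len(u))]
        a_new = second_time_derivative(u, dx)
        v = [v[i] + 0.5 * dt * (a[i] + a_new[i]) for i in range(len(v))]
        a = a_new
    return u, v

n = 64
dx = 2.0 * math.pi / n
xs = [i * dx for i in range(n)]
u0 = [math.sin(x) for x in xs]       # field on the initial slice t = 0
v0 = [0.0] * n                       # its first time derivative on that slice
u1, v1 = evolve(u0, v0, dx, dt=0.01, steps=100)   # march forward to t = 1

# Compare with the exact solution sin(x) cos(t) at t = 1.
max_error = max(abs(u1[i] - math.sin(xs[i]) * math.cos(1.0)) for i in range(n))
print(max_error)
```

In the gravitational case the same loop would carry the extra bookkeeping discussed above: the constraints (7.5.1) restrict the admissible initial data, and four coordinate conditions such as (7.5.3) supply the second derivatives that (7.5.2) leaves undetermined.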


6 Energy, Momentum, and Angular Momentum of Gravitation 

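The physical significance of the Einstein equations can be clarified by writing
them in an entirely equivalent form that, because not manifestly covariant,
reveals their relation to the wave equations of elementary particle physics. Let us
adopt a coordinate system that is quasi-Minkowskian, in the sense that the metric
$g_{\mu\nu}$ approaches the Minkowski metric $\eta_{\mu\nu}$ at great distances from the finite material
system under study. (This is the case in harmonic coordinate systems, and others
as well.) We then write

$$g_{\mu\nu} = \eta_{\mu\nu} + h_{\mu\nu} \tag{7.6.1}$$

so that $h_{\mu\nu}$ vanishes at infinity. (However, $h_{\mu\nu}$ is not assumed to be small
everywhere.) The part of the Ricci tensor linear in $h_{\mu\nu}$ is then

$$R^{(1)}{}_{\mu\kappa} \equiv \tfrac12\left(\frac{\partial^2 h^\lambda{}_\lambda}{\partial x^\mu\,\partial x^\kappa} - \frac{\partial^2 h^\lambda{}_\mu}{\partial x^\lambda\,\partial x^\kappa} - \frac{\partial^2 h^\lambda{}_\kappa}{\partial x^\lambda\,\partial x^\mu} + \frac{\partial^2 h_{\mu\kappa}}{\partial x^\lambda\,\partial x_\lambda}\right) \tag{7.6.2}$$

[See Eq. (6.6.2). We are adopting the convenient convention that indices on
$h_{\mu\nu}$, $R^{(1)}{}_{\mu\nu}$, and $\partial/\partial x^\lambda$ are raised and lowered with $\eta$’s, for example, $h^\lambda{}_\lambda \equiv \eta^{\lambda\nu}h_{\lambda\nu}$
and $\partial/\partial x_\lambda \equiv \eta^{\lambda\nu}\,\partial/\partial x^\nu$, whereas indices on true tensors such as $R_{\mu\kappa}$ are raised and
lowered with $g$’s as usual.] The exact Einstein equations can then be written as

$$R^{(1)}{}_{\mu\kappa} - \tfrac12\eta_{\mu\kappa}R^{(1)\lambda}{}_{\lambda} = -8\pi G\left[T_{\mu\kappa} + t_{\mu\kappa}\right] \tag{7.6.3}$$

where

$$t_{\mu\kappa} \equiv \frac{1}{8\pi G}\left[R_{\mu\kappa} - \tfrac12 g_{\mu\kappa}R^\lambda{}_\lambda - R^{(1)}{}_{\mu\kappa} + \tfrac12\eta_{\mu\kappa}R^{(1)\lambda}{}_{\lambda}\right] \tag{7.6.4}$$

Equation (7.6.3) has just the form we should expect for the wave equation of a
field of spin 2 (see Section 10.2) but with the peculiarity that its “source” $T_{\mu\kappa} + t_{\mu\kappa}$
depends explicitly on the field $h_{\mu\nu}$. We interpret this feature by saying that the
field $h_{\mu\nu}$ is generated by the total densities and fluxes of energy and momentum,
and $t_{\mu\kappa}$ is simply the energy-momentum “tensor” of the gravitational field itself. That
is, we interpret the quantity

$$\tau^{\nu\lambda} \equiv \eta^{\nu\mu}\eta^{\lambda\kappa}\left[T_{\mu\kappa} + t_{\mu\kappa}\right] \tag{7.6.5}$$

as the total energy-momentum “tensor” of matter and gravitation. There are
several properties of $\tau^{\nu\lambda}$ that support this interpretation: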





(A) The quantities $R^{(1)\nu\lambda}$ obey the linearized Bianchi identities:

$$\frac{\partial}{\partial x^\nu}\left[R^{(1)\nu\lambda} - \tfrac12\eta^{\nu\lambda}R^{(1)\mu}{}_{\mu}\right] \equiv 0 \tag{7.6.6}$$

It therefore follows from the field equations (7.6.3) that $\tau^{\nu\lambda}$ is locally conserved:

$$\frac{\partial}{\partial x^\nu}\tau^{\nu\lambda} = 0 \tag{7.6.7}$$

Note that although $T^{\nu\lambda}$ obeys the covariant conservation law $T^{\nu\lambda}{}_{;\nu} = 0$, which
really describes the exchange of energy between matter and gravitation, the quantity
$\tau^{\nu\lambda}$ is conserved in the ordinary sense. In particular, for any finite system of
volume $V$ bounded by a surface $S$, Eq. (7.6.7) tells us that

$$\frac{d}{dt}\int_V \tau^{0\lambda}\,d^3x = -\int_S \tau^{i\lambda}\,n_i\,dS \tag{7.6.8}$$

where $n$ is the unit outward normal to the surface. Hence we may interpret

$$P^\lambda \equiv \int_V \tau^{0\lambda}\,d^3x \tag{7.6.9}$$

as the total energy-momentum “vector” of the system, including matter,
electromagnetism, and gravitation; $\tau^{i\lambda}$ is the corresponding flux.

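(B) Besides being conserved, $\tau^{\nu\lambda}$ is also symmetric,

$$\tau^{\nu\lambda} = \tau^{\lambda\nu} \tag{7.6.10}$$

and therefore

$$\frac{\partial}{\partial x^\mu}M^{\mu\nu\lambda} = 0 \tag{7.6.11}$$

where

$$M^{\mu\nu\lambda} \equiv \tau^{\mu\lambda}x^\nu - \tau^{\mu\nu}x^\lambda \tag{7.6.12}$$

We can thus interpret $M^{0\nu\lambda}$ and $M^{i\nu\lambda}$ as the density and flux of a total angular
momentum

$$J^{\nu\lambda} \equiv \int d^3x\,M^{0\nu\lambda} = -J^{\lambda\nu} \tag{7.6.13}$$

that is constant if $M^{i\nu\lambda}$ vanishes on the surface of the volume of integration.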

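(C) We can compute $t_{\mu\kappa}$ as a power series in $h$, and find that the first term is
quadratic:

$$t_{\mu\kappa} = \frac{1}{8\pi G}\left[-\tfrac12 h_{\mu\kappa}R^{(1)\lambda}{}_{\lambda} + \tfrac12\eta_{\mu\kappa}h^{\rho\sigma}R^{(1)}{}_{\rho\sigma} + R^{(2)}{}_{\mu\kappa} - \tfrac12\eta_{\mu\kappa}\eta^{\rho\sigma}R^{(2)}{}_{\rho\sigma}\right] + O(h^3) \tag{7.6.14}$$

where $R^{(2)}{}_{\mu\kappa}$ is the second-order part of the Ricci tensor, given by (6.6.2) as

$$R^{(2)}{}_{\mu\kappa} = -\tfrac12 h^{\lambda\nu}\left[\frac{\partial^2 h_{\lambda\nu}}{\partial x^\kappa\,\partial x^\mu} - \frac{\partial^2 h_{\mu\nu}}{\partial x^\kappa\,\partial x^\lambda} - \frac{\partial^2 h_{\lambda\kappa}}{\partial x^\nu\,\partial x^\mu} + \frac{\partial^2 h_{\mu\kappa}}{\partial x^\nu\,\partial x^\lambda}\right]$$
$$\qquad + \tfrac14\left[2\frac{\partial h^\nu{}_\sigma}{\partial x^\nu} - \frac{\partial h^\nu{}_\nu}{\partial x^\sigma}\right]\left[\frac{\partial h^\sigma{}_\mu}{\partial x^\kappa} + \frac{\partial h^\sigma{}_\kappa}{\partial x^\mu} - \frac{\partial h_{\mu\kappa}}{\partial x_\sigma}\right]$$
$$\qquad - \tfrac14\left[\frac{\partial h_{\sigma\kappa}}{\partial x^\lambda} + \frac{\partial h_{\sigma\lambda}}{\partial x^\kappa} - \frac{\partial h_{\lambda\kappa}}{\partial x^\sigma}\right]\left[\frac{\partial h^\sigma{}_\mu}{\partial x_\lambda} + \frac{\partial h^{\sigma\lambda}}{\partial x^\mu} - \frac{\partial h^\lambda{}_\mu}{\partial x_\sigma}\right] \tag{7.6.15}$$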


The example of electrodynamics would have led us to expect the energy-momentum
“tensor” of gravitation to start with a term quadratic in $h_{\mu\nu}$. [Compare Eq.
(2.8.9).] The presence in $t_{\mu\kappa}$ of terms of third and higher order simply means that
the gravitational interaction of the gravitational field with itself also contributes
to the total energy and momentum. Of course, when the gravitational field is weak,
$h_{\mu\nu}$ is small, so our inclusion of $t_{\mu\kappa}$ in (7.6.5) (and our use of $\eta$ to raise indices) does
not seriously change our picture of the energy-momentum content of physical
systems.

(D) Though not generally covariant, $t_{\mu\kappa}$, $\tau^{\nu\lambda}$, and $M^{\mu\nu\lambda}$ are at least
Lorentz-covariant. Thus for a closed system $P^\lambda$ and $J^{\nu\lambda}$ are not only constant, but also
Lorentz-covariant. (See Section 2.6.)

(E) We chose at the beginning of this section to work in a coordinate system
in which $h_{\mu\nu}$ vanishes at infinity. Far away from the finite material system that
produces the gravitational field, $T_{\mu\kappa}$ is zero and $t_{\mu\kappa}$ is of order $h^2$, so the source
term on the right-hand side of the field equations (7.6.3) is effectively confined
to a finite region. This suggests that in a large variety of physical problems $h_{\mu\nu}$
will behave at great distances as do the potentials in electrostatics or Newtonian
gravitational theory, that is, for $r \to \infty$,

$$h_{\mu\nu} = O\!\left(\frac{1}{r}\right) \qquad \frac{\partial h_{\mu\nu}}{\partial x^\lambda} = O\!\left(\frac{1}{r^2}\right) \qquad \frac{\partial^2 h_{\mu\nu}}{\partial x^\lambda\,\partial x^\rho} = O\!\left(\frac{1}{r^3}\right) \tag{7.6.16}$$

In this case, (7.6.14) shows that

$$t_{\mu\kappa} = O\!\left(\frac{1}{r^4}\right) \tag{7.6.17}$$

so the integral $\int \tau^{0\lambda}\,d^3x$ that gives the total energy and momentum converges.
This is why it was so important to identify the coordinate system as
quasi-Minkowskian; if $g_{\mu\nu}$ approached the metric of spherical polar coordinates at
infinity, then our definitions (7.6.1) and (7.6.4) would have led to a gravitational
energy density concentrated at infinity! (Note though that (7.6.16) and (7.6.17)
are not always valid. If the system is eternally radiating gravitational waves
(see Chapter 10), then $h_{\mu\nu}$ oscillates so that $\partial h_{\mu\nu}/\partial x^\lambda$ and $\partial^2 h_{\mu\nu}/\partial x^\lambda\,\partial x^\rho$ are of the
same order as $h_{\mu\nu}$, giving an infinite total energy, which is what we would expect
for gravitational radiation filling all space. In this case not even $h_{\mu\nu}$ behaves like
$1/r$.)

(F) By its construction, $\tau^{\nu\lambda}$ is clearly the energy-momentum “tensor” we
determine when we measure the gravitational field produced by any system.
Indeed, there are many possible definitions of the energy-momentum “tensor”
of gravitation that share most of the good properties of our $t_{\mu\kappa}$ (these definitions
are usually based on the action principle; see Chapter 12), but $t_{\mu\kappa}$ is specially
picked out by its role in (7.6.3) as part of the source of $h_{\mu\nu}$.

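(G) Although calculation of $t_{\mu\kappa}$ in specific physical problems can be a nuisance,
it is fortunately possible to avoid this calculation if all we want is the total energy
and momentum of the system. The left-hand side of the field equations (7.6.3)
can be written as

$$R^{(1)\nu\lambda} - \tfrac12\eta^{\nu\lambda}R^{(1)\mu}{}_{\mu} = \frac{\partial}{\partial x^\rho}Q^{\rho\nu\lambda} \tag{7.6.18}$$

where

$$Q^{\rho\nu\lambda} \equiv \tfrac12\left\{\frac{\partial h^\mu{}_\mu}{\partial x_\nu}\eta^{\rho\lambda} - \frac{\partial h^\mu{}_\mu}{\partial x_\rho}\eta^{\nu\lambda} - \frac{\partial h^{\mu\nu}}{\partial x^\mu}\eta^{\rho\lambda} + \frac{\partial h^{\mu\rho}}{\partial x^\mu}\eta^{\nu\lambda} + \frac{\partial h^{\nu\lambda}}{\partial x_\rho} - \frac{\partial h^{\rho\lambda}}{\partial x_\nu}\right\} \tag{7.6.19}$$

Note that $Q^{\rho\nu\lambda}$ is antisymmetric in its first two indices,

$$Q^{\rho\nu\lambda} = -Q^{\nu\rho\lambda} \tag{7.6.20}$$

from which follows the differential identity (7.6.6). By using the field equations
(7.6.3) in conjunction with (7.6.18) we find for the total energy-momentum
“vector” (7.6.9) the value

$$P^\lambda = -\frac{1}{8\pi G}\int_V \frac{\partial Q^{\rho 0\lambda}}{\partial x^\rho}\,d^3x = -\frac{1}{8\pi G}\int_V \frac{\partial Q^{i0\lambda}}{\partial x^i}\,d^3x$$

and using Gauss’s theorem gives

$$P^\lambda = -\frac{1}{8\pi G}\int Q^{i0\lambda}\,n_i\,r^2\,d\Omega \tag{7.6.21}$$

the integral being taken over a large sphere of radius $r$, with $n$ the outward normal
and $d\Omega$ the differential solid angle; that is,

$$r \equiv (x_i x_i)^{1/2} \qquad n_i \equiv \frac{x_i}{r} \qquad d\Omega = \sin\theta\,d\theta\,d\varphi$$

(Repeated Latin indices are summed over 1, 2, 3.) In greater detail, the total
energy and momentum are given by (7.6.19) and (7.6.21) as

$$P^j = -\frac{1}{16\pi G}\int\left\{-\frac{\partial h_{kk}}{\partial t}\delta_{ij} + \frac{\partial h_{k0}}{\partial x^k}\delta_{ij} - \frac{\partial h_{j0}}{\partial x^i} + \frac{\partial h_{ij}}{\partial t}\right\}n_i\,r^2\,d\Omega \tag{7.6.22}$$

$$P^0 = -\frac{1}{16\pi G}\int\left\{\frac{\partial h_{jj}}{\partial x^i} - \frac{\partial h_{ij}}{\partial x^j}\right\}n_i\,r^2\,d\Omega \tag{7.6.23}$$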
By the same reasoning, the total angular momentum “tensor” (7.6.13) is

$$J^{\nu\lambda} = -\frac{1}{8\pi G}\int\left(x^\nu\frac{\partial Q^{i0\lambda}}{\partial x^i} - x^\lambda\frac{\partial Q^{i0\nu}}{\partial x^i}\right)d^3x$$

As remarked in Section 2.9, the physically interesting components of $J^{\nu\lambda}$ are the
three independent space-space components:

$$J_1 \equiv J^{23} \qquad J_2 \equiv J^{31} \qquad J_3 \equiv J^{12}$$

Using Gauss’s theorem again, these components are given by

$$J^{jk} = -\frac{1}{16\pi G}\int\left\{-x_j\frac{\partial h_{0k}}{\partial x^i} + x_k\frac{\partial h_{0j}}{\partial x^i} + x_j\frac{\partial h_{ki}}{\partial t} - x_k\frac{\partial h_{ji}}{\partial t} + h_{0k}\delta_{ij} - h_{0j}\delta_{ik}\right\}n_i\,r^2\,d\Omega \tag{7.6.24}$$

Thus, in order to calculate the total momentum, energy, and angular momentum
of an arbitrary finite system, it is only necessary to know the asymptotic behavior
of $h_{\mu\nu}$ at great distances.

(H) It has been shown that $P^0$ is always positive, and takes the value zero
only for matter-free empty space.³

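(I) Although $\tau^{\nu\lambda}$ is not a tensor and $P^\lambda$ is not a vector, the total energy and
momenta have the important property of being invariant under any coordinate
transformation that reduces at infinity to the identity. Such a transformation will
be of the form

$$x^\mu \to x'^\mu = x^\mu + \varepsilon^\mu(x)$$

where $\varepsilon^\mu(x)$ vanishes as $r \to \infty$, although $\varepsilon^\mu(x)$ need not be small at finite distances.
The metric tensor in the new coordinate system is

$$g'^{\mu\nu} = g^{\lambda\rho}\,\frac{\partial x'^\mu}{\partial x^\lambda}\frac{\partial x'^\nu}{\partial x^\rho} = g^{\lambda\rho}\left(\delta^\mu{}_\lambda + \frac{\partial\varepsilon^\mu}{\partial x^\lambda}\right)\left(\delta^\nu{}_\rho + \frac{\partial\varepsilon^\nu}{\partial x^\rho}\right)$$

For $r \to \infty$ both $\varepsilon^\mu$ and $h_{\mu\nu}$ are small, so we can calculate $g'^{\mu\nu}$ to first order in $\varepsilon^\mu$
and $h_{\mu\nu}$ by setting $g^{\rho\sigma} \simeq \eta^{\rho\sigma} - h^{\rho\sigma}$ and expanding; this gives

$$g'^{\mu\nu} \simeq \eta^{\mu\nu} - h'^{\mu\nu}$$

where

$$h'^{\mu\nu} = h^{\mu\nu} - \frac{\partial\varepsilon^\mu}{\partial x_\nu} - \frac{\partial\varepsilon^\nu}{\partial x_\mu}$$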
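The change in the quantity (7.6.19) produced by this coordinate transformation is
then given for $r \to \infty$ by

$$\Delta Q^{\rho\nu\lambda} = \tfrac12\left\{-\frac{\partial^2\varepsilon^\mu}{\partial x^\mu\,\partial x_\nu}\eta^{\rho\lambda} + \frac{\partial^2\varepsilon^\mu}{\partial x^\mu\,\partial x_\rho}\eta^{\nu\lambda} + \Box^2\varepsilon^\nu\,\eta^{\rho\lambda} - \Box^2\varepsilon^\rho\,\eta^{\nu\lambda} - \frac{\partial^2\varepsilon^\nu}{\partial x_\rho\,\partial x_\lambda} + \frac{\partial^2\varepsilon^\rho}{\partial x_\nu\,\partial x_\lambda}\right\}$$

or

$$\Delta Q^{\rho\nu\lambda} = \frac{\partial}{\partial x^\sigma}D^{\sigma\rho\nu\lambda}$$

where

$$D^{\sigma\rho\nu\lambda} \equiv \tfrac12\left\{-\frac{\partial\varepsilon^\sigma}{\partial x_\nu}\eta^{\rho\lambda} + \frac{\partial\varepsilon^\sigma}{\partial x_\rho}\eta^{\nu\lambda} + \frac{\partial\varepsilon^\nu}{\partial x_\sigma}\eta^{\rho\lambda} - \frac{\partial\varepsilon^\rho}{\partial x_\sigma}\eta^{\nu\lambda} - \frac{\partial\varepsilon^\nu}{\partial x_\rho}\eta^{\sigma\lambda} + \frac{\partial\varepsilon^\rho}{\partial x_\nu}\eta^{\sigma\lambda}\right\}$$

We note that $D$ is totally antisymmetric in its first three indices

$$D^{\sigma\rho\nu\lambda} = -D^{\rho\sigma\nu\lambda} = -D^{\sigma\nu\rho\lambda} = -D^{\nu\rho\sigma\lambda}$$

and therefore the change in the surface integral takes the form

$$\Delta P^\lambda = -\frac{1}{8\pi G}\int\frac{\partial D^{\sigma i0\lambda}}{\partial x^\sigma}\,n_i\,r^2\,d\Omega = -\frac{1}{8\pi G}\int\frac{\partial D^{ji0\lambda}}{\partial x^j}\,n_i\,r^2\,d\Omega$$

or, using Gauss’s theorem again,

$$\Delta P^\lambda = -\frac{1}{8\pi G}\int\frac{\partial^2 D^{ji0\lambda}}{\partial x^i\,\partial x^j}\,d^3x = 0 \tag{7.6.25}$$

We may note as a corollary that $P^\lambda$ transforms as a four-vector under any
transformation that leaves the metric $\eta_{\mu\nu}$ at infinity unchanged, because any such
transformation can be expressed as the product of a Lorentz transformation
$x^\mu \to \Lambda^\mu{}_\nu x^\nu + a^\mu$, under which $P^\lambda$ transforms as a four-vector (see (D) above),
times a transformation that approaches the identity at infinity and hence does
not change $P^\lambda$.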

(J) If the matter in our system is divided into distant subsystems $S_n$, the
gravitational field can be approximated by writing $h_{\mu\nu}$ as the sum of the $h^n_{\mu\nu}$’s
that would be produced by each subsystem acting alone. (Interference terms
between these different $h^n_{\mu\nu}$’s may be neglected in $t_{\mu\kappa}$, because any place where
one $h^n_{\mu\nu}$ is large, all others are small.) It follows then from the calculation of $P^\lambda$
in (E) above that the total energy and momentum are equal to the sum of the
values $P_n^\lambda$ for each subsystem alone.
The energy-momentum “vector” $P^\lambda$ defined by (7.6.9) is conserved, is a
Lorentz four-vector, and is additive. What more could we ask? Any four quantities
with these properties are uniquely determined to be the usual momentum and
energy (as can be shown formally by applying the conservation laws to a collision
in which distant subsystems come together, interact, and then go off to infinity
again⁴).

The arguments of this section can be turned around to provide yet another
derivation⁵ of Einstein’s field equations. Suppose that we set out to construct
equations for a long-range field of spin 2. General group-theoretic considerations
require them to take the form⁶

$$R^{(1)}{}_{\mu\kappa} - \tfrac12\eta_{\mu\kappa}R^{(1)\lambda}{}_{\lambda} = \Theta_{\mu\kappa} \tag{7.6.26}$$

with $\Theta_{\mu\kappa}$ some source function, which because of the identities (7.6.6) must be
conserved

$$\frac{\partial}{\partial x^\mu}\Theta^\mu{}_\kappa = 0 \tag{7.6.27}$$

It will not do to set $\Theta_{\mu\kappa}$ proportional to the energy-momentum tensor $T_{\mu\kappa}$ of
matter alone, because matter can interchange energy and momentum with
gravitation, and therefore $T_{\mu\kappa}$ does not satisfy (7.6.27). We must include in $\Theta_{\mu\kappa}$
terms involving $h_{\mu\nu}$ itself, and when these terms are calculated by imposing the
condition (7.6.27), we find that the field equation (7.6.26) must be simply (7.6.3),
which is equivalent to Einstein’s theory. We are thus led back to the remark at
the beginning of this chapter, that the major difference between the
electromagnetic and gravitational fields is that the source of the electromagnetic potential
$A_\alpha$ is a conserved current $J_\alpha$ that does not involve $A_\alpha$ because the electromagnetic
field is not itself charged, whereas the source of the gravitational field $h_{\mu\nu}$ is a
conserved “tensor” that must involve $h_{\mu\nu}$ because the gravitational field does
carry energy and momentum.
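
As an illustration of property (G), the surface integral (7.6.23) can be evaluated numerically for a simple asymptotic field. The sketch below assumes, purely for illustration, the static weak-field form $h_{ij} = (2MG/r)\,\delta_{ij}$ at large $r$ (an assumed input, not something derived in this section), differentiates it by central differences, and integrates over a sphere with a midpoint rule. All function names, the radius, and the grid sizes are hypothetical choices.

```python
import math

def h_spatial(x, M, G=1.0):
    """Assumed asymptotic field h_ij = (2 M G / r) delta_ij at the point x."""
    r = math.sqrt(sum(c * c for c in x))
    return [[(2.0 * M * G / r) if i == j else 0.0 for j in range(3)]
            for i in range(3)]

def d_h(x, k, M, G=1.0, eps=1e-4):
    """Central-difference estimate of dh_ij/dx^k at the point x."""
    xp, xm = list(x), list(x)
    xp[k] += eps
    xm[k] -= eps
    hp, hm = h_spatial(xp, M, G), h_spatial(xm, M, G)
    return [[(hp[i][j] - hm[i][j]) / (2.0 * eps) for j in range(3)]
            for i in range(3)]

def total_energy(M, G=1.0, r=50.0, n_theta=40, n_phi=80):
    """Evaluate P^0 from (7.6.23) on a sphere of radius r (midpoint rule)."""
    total = 0.0
    dth, dph = math.pi / n_theta, 2.0 * math.pi / n_phi
    for a in range(n_theta):
        th = (a + 0.5) * dth
        for b in range(n_phi):
            ph = (b + 0.5) * dph
            n = [math.sin(th) * math.cos(ph),
                 math.sin(th) * math.sin(ph),
                 math.cos(th)]
            x = [r * c for c in n]
            dh = [d_h(x, k, M, G) for k in range(3)]   # dh[k][i][j] = dh_ij/dx^k
            integrand = 0.0
            for i in range(3):
                grad_trace = sum(dh[i][j][j] for j in range(3))   # dh_jj/dx^i
                divergence = sum(dh[j][i][j] for j in range(3))   # dh_ij/dx^j
                integrand += (grad_trace - divergence) * n[i]
            total += integrand * r * r * math.sin(th) * dth * dph
    return -total / (16.0 * math.pi * G)

print(total_energy(3.0))
```

For this assumed field the integrand is independent of $r$, so the surface integral returns the mass parameter $M$ (up to discretization error) no matter how large the sphere is taken, which is the sense in which only the asymptotic behavior of $h_{\mu\nu}$ matters.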
PART THREE
APPLICATIONS OF 
GENERAL RELATIVITY 




“The popular mind, in all
times and countries, has 
always tended to go by 
numbers in estimating the 
weight of evidence.” 
Wigmore on Evidence 


8 CLASSIC TESTS OF 
EINSTEIN’S THEORY 


Einstein suggested three tests of general relativity : 

(A) The gravitational red shift of spectral lines. 

(B) The deflection of light by the sun. 

(C) The precession of the perihelia of the orbits of the inner planets. 

Since then, one other test has been carried out : 

(D) The time delay of radar echoes passing the sun. 

And another soon will be : 

(E) The precession of a gyroscope in orbit around the earth. 

All five tests are carried out in empty space and in gravitational fields that are to 
a good approximation static and [except for (E)] spherically symmetric, so our 
first task will be to solve the Einstein vacuum field equations under the simplifying 
assumptions of isotropy and time independence. The results will then be used to 
treat tests (B) through (D). We have already seen in Chapter 3 that (A) tests only
the Principle of Equivalence, so it need not be considered further here, whereas 
(E) involves anisotropic effects owing to the rotation of the earth, and will be 
discussed in Chapter 9. 


1 The General Static Isotropic Metric 

For the moment we put aside Einstein’s equations, and consider what is the
most general metric tensor that can represent a static isotropic gravitational field.
By “static and isotropic” we mean that it must be possible to find a set of
“quasi-Minkowskian” coordinates $x^1, x^2, x^3, x^0 \equiv t$, such that the invariant proper time
$d\tau^2 \equiv -g_{\mu\nu}\,dx^\mu\,dx^\nu$ does not depend on $t$, and depends on $\mathbf{x}$ and $d\mathbf{x}$ only through
the rotational invariants $d\mathbf{x}^2$, $\mathbf{x}\cdot d\mathbf{x}$, and $\mathbf{x}^2$. The most general proper time interval
is then

$$d\tau^2 = F(r)\,dt^2 - 2E(r)\,dt\,\mathbf{x}\cdot d\mathbf{x} - D(r)\,(\mathbf{x}\cdot d\mathbf{x})^2 - C(r)\,d\mathbf{x}^2 \tag{8.1.1}$$

where $F$, $E$, $D$, and $C$ are unknown functions of

$$r \equiv (\mathbf{x}\cdot\mathbf{x})^{1/2}$$

(Scalar products of three-vectors are throughout this chapter defined as usual,
e.g., $\mathbf{x}\cdot d\mathbf{x} \equiv x^1\,dx^1 + x^2\,dx^2 + x^3\,dx^3$, etc.) A deeper derivation of Eq. (8.1.1)
will be given in Chapter 13; for the present we can regard (8.1.1) as a definition of
what we mean by a static isotropic metric, or alternatively as an ansatz that
allows us to find some solutions of the field equations.

It is convenient to replace $\mathbf{x}$ with spherical polar coordinates $r, \theta, \varphi$, defined
as usual by

$$x^1 = r\sin\theta\cos\varphi \qquad x^2 = r\sin\theta\sin\varphi \qquad x^3 = r\cos\theta$$

The proper time interval (8.1.1) then becomes

$$d\tau^2 = F(r)\,dt^2 - 2rE(r)\,dt\,dr - r^2D(r)\,dr^2 - C(r)\left(dr^2 + r^2\,d\theta^2 + r^2\sin^2\theta\,d\varphi^2\right) \tag{8.1.2}$$

We are free to reset our clocks by defining a new time coordinate

$$t' \equiv t + \Phi(r)$$

with $\Phi$ an arbitrary function of $r$. This allows us to eliminate the off-diagonal
element $g_{tr}$ by setting

$$\frac{d\Phi}{dr} = -\frac{rE(r)}{F(r)}$$

The proper time (8.1.2) then becomes

$$d\tau^2 = F(r)\,dt'^2 - G(r)\,dr^2 - C(r)\left(dr^2 + r^2\,d\theta^2 + r^2\sin^2\theta\,d\varphi^2\right) \tag{8.1.3}$$

where

$$G(r) \equiv r^2\left(D(r) + \frac{E^2(r)}{F(r)}\right)$$
We are also free to redefine the radius $r$, and thereby impose one further
relation on the functions $F$, $G$, and $C$. For instance, suppose that we define

$$r'^2 \equiv C(r)\,r^2$$

Then the proper time (8.1.3) takes what is called the standard form

$$d\tau^2 = B(r')\,dt'^2 - A(r')\,dr'^2 - r'^2\left(d\theta^2 + \sin^2\theta\,d\varphi^2\right) \tag{8.1.4}$$

where

$$B(r') \equiv F(r)$$

$$A(r') \equiv \left(1 + \frac{G(r)}{C(r)}\right)\left(1 + \frac{r}{2C(r)}\frac{dC(r)}{dr}\right)^{-2}$$

Alternatively, we could define

$$r'' \equiv \exp\int\left(1 + \frac{G(r)}{C(r)}\right)^{1/2}\frac{dr}{r}$$

and (8.1.3) would then appear in what is called the isotropic form,

$$d\tau^2 = H(r'')\,dt'^2 - J(r'')\left(dr''^2 + r''^2\,d\theta^2 + r''^2\sin^2\theta\,d\varphi^2\right) \tag{8.1.5}$$

where

$$H(r'') \equiv F(r) \qquad J(r'') \equiv \frac{C(r)\,r^2}{r''^2}$$

We shall do most of our work with a metric of the “standard” form:

$$d\tau^2 = B(r)\,dt^2 - A(r)\,dr^2 - r^2\left(d\theta^2 + \sin^2\theta\,d\varphi^2\right) \tag{8.1.6}$$

(We drop primes on $r$ and $t$ from now on.) The metric tensor has the nonvanishing
components

$$g_{rr} = A(r) \qquad g_{\theta\theta} = r^2 \qquad g_{\varphi\varphi} = r^2\sin^2\theta \qquad g_{tt} = -B(r) \tag{8.1.7}$$

with functions $A(r)$ and $B(r)$ that are to be determined by solving the field equations.
Since $g_{\mu\nu}$ is diagonal, it is easy to write down all the nonvanishing components of
its inverse:

$$g^{rr} = A^{-1}(r) \qquad g^{\theta\theta} = r^{-2} \qquad g^{\varphi\varphi} = r^{-2}(\sin\theta)^{-2} \qquad g^{tt} = -B^{-1}(r) \tag{8.1.8}$$

Furthermore, the determinant of the metric tensor is $-g$, where

$$g = r^4 A(r)B(r)\sin^2\theta \tag{8.1.9}$$

so the invariant volume element is

$$\sqrt{g}\,dr\,d\theta\,d\varphi = r^2\sqrt{A(r)B(r)}\,\sin\theta\,dr\,d\theta\,d\varphi \tag{8.1.10}$$
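The affine connection can be computed from the usual formula:

$$\Gamma^\lambda_{\mu\nu} = \tfrac12 g^{\lambda\rho}\left\{\frac{\partial g_{\rho\mu}}{\partial x^\nu} + \frac{\partial g_{\rho\nu}}{\partial x^\mu} - \frac{\partial g_{\mu\nu}}{\partial x^\rho}\right\}$$

Its only nonvanishing components are

$$\Gamma^r_{rr} = \frac{1}{2A(r)}\frac{dA(r)}{dr} \qquad \Gamma^r_{\theta\theta} = -\frac{r}{A(r)} \qquad \Gamma^r_{\varphi\varphi} = -\frac{r\sin^2\theta}{A(r)} \qquad \Gamma^r_{tt} = \frac{1}{2A(r)}\frac{dB(r)}{dr}$$

$$\Gamma^\theta_{r\theta} = \Gamma^\theta_{\theta r} = \frac{1}{r} \qquad \Gamma^\theta_{\varphi\varphi} = -\sin\theta\cos\theta$$

$$\Gamma^\varphi_{\varphi r} = \Gamma^\varphi_{r\varphi} = \frac{1}{r} \qquad \Gamma^\varphi_{\varphi\theta} = \Gamma^\varphi_{\theta\varphi} = \cot\theta$$

$$\Gamma^t_{tr} = \Gamma^t_{rt} = \frac{1}{2B(r)}\frac{dB(r)}{dr} \tag{8.1.11}$$

We also need the Ricci tensor. It is given by (6.2.4) and (6.1.5) as

$$R_{\mu\kappa} = \frac{\partial\Gamma^\lambda_{\mu\lambda}}{\partial x^\kappa} - \frac{\partial\Gamma^\lambda_{\mu\kappa}}{\partial x^\lambda} + \Gamma^\eta_{\mu\lambda}\Gamma^\lambda_{\kappa\eta} - \Gamma^\eta_{\mu\kappa}\Gamma^\lambda_{\lambda\eta} \tag{8.1.12}$$

(Note that despite its appearance, the first term is symmetric in $\mu$ and $\kappa$, because
(4.7.6) gives $\Gamma^\lambda_{\mu\lambda}$ equal to $\tfrac12\,\partial\ln g/\partial x^\mu$.) Inserting in (8.1.12) the components of the
affine connection given by (8.1.11), we find

$$R_{rr} = \frac{B''(r)}{2B(r)} - \frac14\left(\frac{B'(r)}{B(r)}\right)\left(\frac{A'(r)}{A(r)} + \frac{B'(r)}{B(r)}\right) - \frac1r\left(\frac{A'(r)}{A(r)}\right)$$

$$R_{\theta\theta} = -1 + \frac{r}{2A(r)}\left(-\frac{A'(r)}{A(r)} + \frac{B'(r)}{B(r)}\right) + \frac{1}{A(r)}$$

$$R_{\varphi\varphi} = \sin^2\theta\,R_{\theta\theta}$$

$$R_{tt} = -\frac{B''(r)}{2A(r)} + \frac14\left(\frac{B'(r)}{A(r)}\right)\left(\frac{A'(r)}{A(r)} + \frac{B'(r)}{B(r)}\right) - \frac1r\left(\frac{B'(r)}{A(r)}\right)$$

$$R_{\mu\nu} = 0 \quad\text{for } \mu \neq \nu \tag{8.1.13}$$

(A prime now means differentiation with respect to $r$.) The results that $R_{r\theta}$, $R_{r\varphi}$,
$R_{t\theta}$, $R_{t\varphi}$, and $R_{\theta\varphi}$ vanish, and that $R_{\varphi\varphi} = \sin^2\theta\,R_{\theta\theta}$, are merely consequences of
the rotational invariance of the metric, whereas the result that $R_{rt}$ vanishes is
because we have set our clocks so that the metric is invariant under the
time-reversal transformation $t \to -t$.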
Neither the standard nor the isotropic coordinates are harmonic, but we can easily use the results (8.1.7) and (8.1.11) for the metric and affine connection in standard coordinates to construct harmonic coordinates $X_1, X_2, X_3, t$. We set

$$ X_1 = R(r)\sin\theta\cos\varphi, \qquad X_2 = R(r)\sin\theta\sin\varphi, \qquad X_3 = R(r)\cos\theta \tag{8.1.14} $$

A straightforward calculation gives then

$$ \Box^2 X_i \equiv g^{\mu\nu}\left[\frac{\partial^2 X_i}{\partial x^\mu\,\partial x^\nu} - \Gamma^\lambda_{\mu\nu}\frac{\partial X_i}{\partial x^\lambda}\right] = \frac{X_i}{R}\left[\frac{1}{A}\frac{d^2R}{dr^2} + \frac{1}{A}\left(\frac{B'}{2B} - \frac{A'}{2A} + \frac{2}{r}\right)\frac{dR}{dr} - \frac{2R}{r^2}\right] $$

Also, the standard time coordinate t satisfies

$$ \Box^2 t = 0 $$

Thus the coordinates $X_1, X_2, X_3, t$ are harmonic if R(r) satisfies the differential equation

$$ \frac{d}{dr}\left(r^2 B^{1/2} A^{-1/2}\frac{dR}{dr}\right) - 2A^{1/2}B^{1/2}R = 0 \tag{8.1.15} $$

In these harmonic coordinates the proper time (8.1.6) becomes

$$ d\tau^2 = B\,dt^2 - \frac{r^2}{R^2}\,d\mathbf{X}^2 - \left[\frac{A}{R^2R'^2} - \frac{r^2}{R^4}\right](\mathbf{X}\cdot d\mathbf{X})^2 \tag{8.1.16} $$


2 The Schwarzschild Solution 

We now apply the Einstein field equations to the general static isotropic metric. We use the standard form discussed in the last section, that is,

$$ d\tau^2 = B(r)\,dt^2 - A(r)\,dr^2 - r^2\,d\theta^2 - r^2\sin^2\theta\,d\varphi^2 \tag{8.2.1} $$

The field equations for empty space are

$$ R_{\mu\nu} = 0 \tag{8.2.2} $$

The components of the Ricci tensor are given for this metric by Eq. (8.1.13). We see that it will suffice to set $R_{rr}$, $R_{\theta\theta}$, and $R_{tt}$ equal to zero. We also see that

$$ \frac{R_{rr}}{A} + \frac{R_{tt}}{B} = -\frac{1}{rA}\left(\frac{A'}{A} + \frac{B'}{B}\right) \tag{8.2.3} $$
so (8.2.2) requires that $B'/B = -A'/A$, or

$$ A(r)B(r) = \text{constant} \tag{8.2.4} $$

Furthermore, we impose on A and B the boundary condition that for $r \to \infty$ the metric tensor must approach the Minkowski tensor in spherical coordinates, that is,

$$ \lim_{r\to\infty} A(r) = \lim_{r\to\infty} B(r) = 1 \tag{8.2.5} $$

From (8.2.4) and (8.2.5) we have then

$$ A(r) = \frac{1}{B(r)} \tag{8.2.6} $$

Since (8.2.3) now vanishes, it remains to make $R_{rr}$ and $R_{\theta\theta}$ vanish. Using (8.2.6) in (8.1.13), we find

$$ R_{\theta\theta} = -1 + B'(r)\,r + B(r) \tag{8.2.7} $$

$$ R_{rr} = \frac{B''(r)}{2B(r)} + \frac{B'(r)}{rB(r)} = \frac{R'_{\theta\theta}(r)}{2rB(r)} \tag{8.2.8} $$

so it is sufficient to set $R_{\theta\theta}$ equal to zero, that is,

$$ \frac{d}{dr}\bigl(rB(r)\bigr) = rB'(r) + B(r) = 1 \tag{8.2.9} $$

The solution is

$$ rB(r) = r + \text{constant} $$

To fix the constant of integration we recall that at great distances from a central mass M, the component $g_{tt} = -B$ must approach $-1 - 2\phi$, where $\phi$ is the Newtonian potential $-MG/r$. (See Section 3.4.) Hence the constant of integration is $-2MG$, and our final solution is

$$ B(r) = \left[1 - \frac{2MG}{r}\right] \tag{8.2.10} $$

$$ A(r) = \left[1 - \frac{2MG}{r}\right]^{-1} \tag{8.2.11} $$

The full metric is given by

$$ d\tau^2 = \left[1 - \frac{2MG}{r}\right]dt^2 - \left[1 - \frac{2MG}{r}\right]^{-1}dr^2 - r^2\,d\theta^2 - r^2\sin^2\theta\,d\varphi^2 \tag{8.2.12} $$

This solution was found by K. Schwarzschild in 1916.
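As a quick numerical sanity check (a sketch, not part of the derivation), the Ricci components of Eq. (8.1.13) can be evaluated for the functions (8.2.10) and (8.2.11). The units G = c = 1, the value `mg = 1.0`, the sample radii, and all function names are illustrative assumptions.

```python
# Check that the Schwarzschild functions make the Ricci components (8.1.13) vanish.
# Assumed units: G = c = 1, so MG is a length. mg = 1.0 is an arbitrary choice.

def ricci_components(A, dA, B, dB, d2B, r):
    """Return (R_rr, R_thetatheta, R_tt) as written in Eq. (8.1.13)."""
    R_rr = d2B / (2 * B) - 0.25 * (dB / B) * (dA / A + dB / B) - (dA / A) / r
    R_thth = -1 + (r / (2 * A)) * (-dA / A + dB / B) + 1 / A
    R_tt = -d2B / (2 * A) + 0.25 * (dB / A) * (dA / A + dB / B) - (dB / A) / r
    return R_rr, R_thth, R_tt

def schwarzschild(mg, r):
    """A, A', B, B', B'' for B = 1 - 2MG/r and A = 1/B."""
    B = 1 - 2 * mg / r
    dB = 2 * mg / r**2
    d2B = -4 * mg / r**3
    A = 1 / B
    dA = -dB / B**2
    return A, dA, B, dB, d2B

def max_ricci_residual(mg=1.0, radii=(3.0, 10.0, 1000.0)):
    """Largest |R_rr|, |R_thetatheta|, |R_tt| over the sample radii."""
    worst = 0.0
    for r in radii:
        A, dA, B, dB, d2B = schwarzschild(mg, r)
        worst = max(worst, *(abs(x) for x in ricci_components(A, dA, B, dB, d2B, r)))
    return worst

print(max_ricci_residual())  # should be zero up to rounding error
```

The residual is limited only by floating-point rounding, since the derivatives of B are supplied analytically.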

The Schwarzschild solution is expressed in Eq. (8.2.12) in its “standard” form. We can also express it in the equivalent “isotropic” form, by introducing a new radius variable

$$ \rho = \tfrac{1}{2}\left[r - MG + (r^2 - 2MGr)^{1/2}\right] \tag{8.2.13} $$

or

$$ r = \rho\left(1 + \frac{MG}{2\rho}\right)^2 $$

Substituting this in Eq. (8.2.12) gives

$$ d\tau^2 = \frac{(1 - MG/2\rho)^2}{(1 + MG/2\rho)^2}\,dt^2 - \left(1 + \frac{MG}{2\rho}\right)^4\left(d\rho^2 + \rho^2\,d\theta^2 + \rho^2\sin^2\theta\,d\varphi^2\right) \tag{8.2.14} $$
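The change of radial variable between the standard and isotropic forms can be checked numerically. The sketch below assumes units with G = c = 1 and uses hypothetical helper names; it verifies that (8.2.13) and its inverse are consistent and that the time-time coefficient of (8.2.14) reproduces B(r) of (8.2.10).

```python
import math

def rho_from_r(r, mg):
    """Isotropic radius rho from standard radius r, Eq. (8.2.13)."""
    return 0.5 * (r - mg + math.sqrt(r * r - 2 * mg * r))

def r_from_rho(rho, mg):
    """Inverse relation r = rho (1 + MG/2rho)^2."""
    return rho * (1 + mg / (2 * rho)) ** 2

def b_isotropic(rho, mg):
    """Coefficient of dt^2 in the isotropic form (8.2.14)."""
    x = mg / (2 * rho)
    return (1 - x) ** 2 / (1 + x) ** 2

mg = 1.0  # illustrative value
for r in (3.0, 10.0, 1.0e6):
    rho = rho_from_r(r, mg)
    print(r, rho, r_from_rho(rho, mg), b_isotropic(rho, mg) - (1 - 2 * mg / r))
```

Note that (8.2.13) is real only for r ≥ 2MG, so the sample radii are chosen outside that value.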


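We can also construct harmonic coordinates

$$ X_1 = R\sin\theta\cos\varphi; \qquad X_2 = R\sin\theta\sin\varphi; \qquad X_3 = R\cos\theta; \qquad t $$

by using for R a solution of the differential equation (8.1.15), which here becomes

$$ \frac{d}{dr}\left[r^2\left(1 - \frac{2MG}{r}\right)\frac{dR}{dr}\right] - 2R = 0 $$

One convenient solution is

$$ R = r - MG $$

The metric is then given by Eq. (8.1.16):

$$ d\tau^2 = \left(\frac{1 - MG/R}{1 + MG/R}\right)dt^2 - \left(1 + \frac{MG}{R}\right)^2 d\mathbf{X}^2 - \left(\frac{1 + MG/R}{1 - MG/R}\right)\frac{M^2G^2}{R^4}\,(\mathbf{X}\cdot d\mathbf{X})^2 \tag{8.2.15} $$

with $R^2 \equiv \mathbf{X}^2$ now understood.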
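The claim that R = r − MG solves the harmonic condition can be tested with a finite-difference residual. This is only a sketch under assumed units G = c = 1; the step size `h`, the sample radius, and the function names are illustrative choices, and the second call uses a trial function that is not a solution, for contrast.

```python
# Finite-difference residual of Eq. (8.1.15) for the Schwarzschild metric:
#   d/dr[ r^2 (1 - 2MG/r) dR/dr ] - 2R = 0

def harmonic_residual(R, mg, r, h=1e-3):
    """Residual of the harmonic condition for a trial function R(r)."""
    def flux(x):
        # r^2 B^{1/2} A^{-1/2} dR/dr, where B^{1/2} A^{-1/2} = 1 - 2MG/r here
        dR = (R(x + h) - R(x - h)) / (2 * h)
        return x**2 * (1 - 2 * mg / x) * dR
    return (flux(r + h) - flux(r - h)) / (2 * h) - 2 * R(r)

mg = 1.0  # illustrative value
print(harmonic_residual(lambda r: r - mg, mg, 10.0))  # close to zero
print(harmonic_residual(lambda r: r, mg, 10.0))       # nonzero: R = r is not harmonic
```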

We identified the integration constant M with the mass of the sun by comparison with Newton’s theory. In fact, we can show that M is precisely equal to the total energy $P^0$ of the sun and its gravitational field. Let us write the standard form of the metric in quasi-Minkowskian coordinates, by defining

$$ x^1 = r\sin\theta\cos\varphi, \qquad x^2 = r\sin\theta\sin\varphi, \qquad x^3 = r\cos\theta $$

Then Eq. (8.2.12) becomes

$$ d\tau^2 = \left[1 - \frac{2MG}{r}\right]dt^2 - \left\{\left[1 - \frac{2MG}{r}\right]^{-1} - 1\right\}\frac{(\mathbf{x}\cdot d\mathbf{x})^2}{r^2} - d\mathbf{x}^2 $$

Since $g_{\mu\nu}$ is time independent and $g_{i0}$ vanishes, it follows from (7.6.22) that the total momentum $P^i$ of the system vanishes, which of course it must do since the system is static and isotropic. To calculate the total energy, we need the asymptotic behavior of the spatial part of the metric; as $r \to \infty$,

$$ h_{ij} \equiv g_{ij} - \delta_{ij} \to \frac{2MG}{r}\,n_i n_j + O\!\left(\frac{1}{r^2}\right) $$

where $n_i \equiv x^i/r$. To calculate the integral (7.6.21) we use the relations

$$ \frac{\partial r}{\partial x^i} = n_i, \qquad \frac{\partial n_i}{\partial x^j} = \frac{\delta_{ij} - n_i n_j}{r} $$

and find

$$ \frac{\partial h_{jj}}{\partial x^i} - \frac{\partial h_{ij}}{\partial x^j} = -\frac{4MG}{r^2}\,n_i $$

so Eq. (7.6.23) gives the total energy of matter and gravitation here as

$$ P^0 = M \tag{8.2.16} $$

The reader may check that the same result would be given by the isotropic or 
harmonic forms of the Schwarzschild solution. Finally, Eq. (7.6.24) gives for the 
total angular momentum here the expected value zero. 


3 Other Metrics 

The general kinematic framework provided by the Principle of Equivalence 
rests on a much firmer foundation than do Einstein’s field equations. Indeed, in 
Chapters 3 through 5 we were led almost inevitably from the equality of gravita- 
tional and inertial mass to the full formalism of tensor analysis and general 
covariance, whereas in contrast the derivation of Einstein’s equations in Chapter 7 
contained a strong element of guesswork, and in any case there might exist a long- 
range scalar field, like that of Brans and Dicke, that would alter the field equations. 
It is therefore very useful to test general relativity by assuming that the usual rules for the motion of particles and photons in a given metric field $g_{\mu\nu}$ still apply, but that the metric may be different from that calculated from the Einstein equations.

In any case we would expect the metric produced by a static spherically symmetric body like the sun to be expressible in the “standard,” “isotropic,” and “harmonic” forms given in Section 8.1, and we would further expect that the metric coefficients [e.g., A(r) and B(r)] could be expanded as power series in the small parameter MG/r. Such an expansion was given by Eddington and Robertson¹ for the metric in its isotropic form:

$$ d\tau^2 = \left(1 - 2\alpha\frac{MG}{\rho} + 2\beta\frac{M^2G^2}{\rho^2} + \cdots\right)dt^2 - \left(1 + 2\gamma\frac{MG}{\rho} + \cdots\right)\left(d\rho^2 + \rho^2\,d\theta^2 + \rho^2\sin^2\theta\,d\varphi^2\right) \tag{8.3.1} $$


where $\alpha$, $\beta$, and $\gamma$ are unknown dimensionless parameters. (The reason for carrying this expansion to order $M^2G^2/\rho^2$ in $g_{00}$ and only to order $MG/\rho$ in $g_{ij}$ is that in applications to celestial mechanics $g_{ij}$ will always get multiplied with an extra factor $v^2 \sim MG/\rho$.) Comparing with the isotropic form (8.2.14) of the Schwarzschild solution, we see that the predictions of the Einstein field equations can be neatly summarized as

$$ \alpha = \beta = \gamma = 1 \tag{8.3.2} $$
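The comparison between the exact isotropic solution (8.2.14) and the truncated expansion (8.3.1) can be illustrated numerically. In this sketch x stands for MG/ρ and the function names are hypothetical; the differences should shrink like x³ in the time-time coefficient and like x² in the spatial coefficient, the orders at which (8.3.1) is truncated.

```python
def exact_isotropic(x):
    """Coefficients of dt^2 and of the spatial part in (8.2.14), with x = MG/rho."""
    g00 = (1 - x / 2) ** 2 / (1 + x / 2) ** 2
    gspace = (1 + x / 2) ** 4
    return g00, gspace

def robertson(x, alpha=1.0, beta=1.0, gamma=1.0):
    """Truncated Eddington-Robertson expansion (8.3.1)."""
    return 1 - 2 * alpha * x + 2 * beta * x**2, 1 + 2 * gamma * x

for x in (1e-2, 1e-3):
    e, s = exact_isotropic(x), robertson(x)
    # differences are of order x^3 and x^2 respectively when alpha = beta = gamma = 1
    print(x, e[0] - s[0], e[1] - s[1])
```

Choosing any other value of α, β, or γ leaves a difference at the order of the corresponding term, which is one way to see why (8.3.2) summarizes Einstein's prediction.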


In contrast, the Brans-Dicke theory discussed in Section 7.3 gives a metric (see Section 9.9) that can be expressed as in (8.3.1), with

$$ \alpha = \beta = 1, \qquad \gamma = \frac{\omega + 1}{\omega + 2} \tag{8.3.3} $$

where $\omega$ is the unknown dimensionless parameter of this theory. In order to decide whether Einstein, or Brans and Dicke, or someone else, has the right field equations, what must be done is to measure $\alpha$, $\beta$, and $\gamma$.

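We shall generally be doing our calculations with the metric in its “standard” form, so it will prove convenient to convert the Robertson expansion (8.3.1) to this form by defining

$$ r = \rho\left(1 + \gamma\frac{MG}{\rho} + \cdots\right) \qquad \text{or} \qquad \rho = r\left(1 - \gamma\frac{MG}{r} + \cdots\right) \tag{8.3.4} $$

A simple calculation gives

$$ d\tau^2 = \left(1 - 2\alpha\frac{MG}{r} + 2(\beta - \alpha\gamma)\frac{M^2G^2}{r^2} + \cdots\right)dt^2 - \left(1 + 2\gamma\frac{MG}{r} + \cdots\right)dr^2 - r^2\,d\theta^2 - r^2\sin^2\theta\,d\varphi^2 \tag{8.3.5} $$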
Also, we can construct harmonic coordinates $\mathbf{X}, t$ by using for $\mathbf{X}$

$$ X_1 = R\sin\theta\cos\varphi, \qquad X_2 = R\sin\theta\sin\varphi, \qquad X_3 = R\cos\theta $$

with R satisfying the differential equation (8.1.15):

$$ 0 = \frac{d}{dr}\left[r^2\left(1 - (\alpha + \gamma)\frac{MG}{r} + \cdots\right)\frac{dR}{dr}\right] - 2\left(1 - (\alpha - \gamma)\frac{MG}{r} + \cdots\right)R $$

The solution is

$$ R = r\left(1 + (\alpha - 3\gamma)\frac{MG}{2r} + \cdots\right) \tag{8.3.6} $$

and (8.1.16) gives the metric (with $R^2 \equiv \mathbf{X}^2$):

$$ d\tau^2 = \left[1 - 2\alpha\frac{MG}{R} + (\alpha\gamma - \alpha^2 + 2\beta)\frac{M^2G^2}{R^2} + \cdots\right]dt^2 - \left[1 + (3\gamma - \alpha)\frac{MG}{R} + \cdots\right]d\mathbf{X}^2 - R^{-2}\left[(\alpha - \gamma)\frac{MG}{R} + \cdots\right](\mathbf{X}\cdot d\mathbf{X})^2 \tag{8.3.7} $$

Comparing (8.3.5) and (8.3.7) with the corresponding exact solutions (8.2.12) and (8.2.15) shows again that Einstein’s theory gives $\alpha = \beta = \gamma = 1$.

The prediction that $\alpha = 1$ really just follows from the empirical definition of the mass M. Note that Eq. (8.3.1) would give a slowly moving particle far from the origin a centripetal acceleration equal to

$$ -g \equiv -\Gamma^r_{tt} = \frac{1}{2}\frac{\partial g_{tt}}{\partial r} \simeq -\frac{\alpha MG}{r^2} \qquad (\text{for } MG/r \ll 1 \text{ and } v^2 \ll 1) $$

whereas in fact the masses of the sun and planets are measured by setting $g = MG/r^2$; hence we must absorb $\alpha$ into M or, in other words, we must choose $\alpha = 1$. Only if it were possible to determine M by some independent nongravitational measurement would it make sense to ask whether in fact $\alpha$ is exactly unity.

With $\alpha = 1$, the metric functions given by (8.3.5) are

$$ B(r) = 1 - \frac{2MG}{r} + 2(\beta - \gamma)\frac{M^2G^2}{r^2} + \cdots \tag{8.3.8} $$

$$ A(r) = 1 + 2\gamma\frac{MG}{r} + \cdots \tag{8.3.9} $$

As shown in Chapter 3, the gravitational red shift experiment only measures the term $-2MG/r$ in B(r), and hence can only verify the Principle of Equivalence. We shall see that, of the other tests of general relativity listed at the beginning of this chapter, (B) and (D) can only test whether $\gamma \simeq 1$, whereas (C), the precession of perihelia, verifies that $2\gamma - \beta \simeq 1$. (To the extent that we ignore the rotation of the earth, (E) also only tests whether $\gamma \simeq 1$.)


4 General Equations of Motion 


We now consider the motion of a freely falling material particle or photon in a static isotropic gravitational field. First let us consider the most general such metric in the standard form derived in Section 1, that is,

$$ d\tau^2 = B(r)\,dt^2 - A(r)\,dr^2 - r^2\,d\theta^2 - r^2\sin^2\theta\,d\varphi^2 \tag{8.4.1} $$

The equations of free fall are

$$ \frac{d^2x^\mu}{dp^2} + \Gamma^\mu_{\nu\lambda}\frac{dx^\nu}{dp}\frac{dx^\lambda}{dp} = 0 \tag{8.4.2} $$

where p is a parameter describing the trajectory. In general $d\tau$ is proportional to $dp$, so for a material particle we could normalize p so that $p = \tau$. However, for a photon the proportionality constant $d\tau/dp$ vanishes, and since we wish to treat photons as well as massive particles, we shall find it convenient to reserve the right to fix the normalization of p independently from that of $\tau$.

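Using the nonvanishing components of the affine connection given by Eq. (8.1.11), we find from (8.4.2) that

$$ 0 = \frac{d^2r}{dp^2} + \frac{A'(r)}{2A(r)}\left(\frac{dr}{dp}\right)^2 - \frac{r}{A(r)}\left(\frac{d\theta}{dp}\right)^2 - \frac{r\sin^2\theta}{A(r)}\left(\frac{d\varphi}{dp}\right)^2 + \frac{B'(r)}{2A(r)}\left(\frac{dt}{dp}\right)^2 \tag{8.4.3} $$

$$ 0 = \frac{d^2\theta}{dp^2} + \frac{2}{r}\frac{d\theta}{dp}\frac{dr}{dp} - \sin\theta\cos\theta\left(\frac{d\varphi}{dp}\right)^2 \tag{8.4.4} $$

$$ 0 = \frac{d^2\varphi}{dp^2} + \frac{2}{r}\frac{d\varphi}{dp}\frac{dr}{dp} + 2\cot\theta\,\frac{d\varphi}{dp}\frac{d\theta}{dp} \tag{8.4.5} $$

$$ 0 = \frac{d^2t}{dp^2} + \frac{B'(r)}{B(r)}\frac{dt}{dp}\frac{dr}{dp} \tag{8.4.6} $$

(A prime denotes d/dr.) We solve these equations by looking for constants of the motion.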

Since the field is isotropic, we may consider the orbit of our particle to be confined to the equatorial plane, that is,

$$ \theta = \frac{\pi}{2} \tag{8.4.7} $$
Then (8.4.4) is immediately satisfied, and we can forget about $\theta$ as a dynamical variable. Dividing (8.4.5) and (8.4.6) by $d\varphi/dp$ and $dt/dp$, respectively, we next find

$$ \frac{d}{dp}\left\{\ln\frac{d\varphi}{dp} + \ln r^2\right\} = 0 \tag{8.4.8} $$

$$ \frac{d}{dp}\left\{\ln\frac{dt}{dp} + \ln B\right\} = 0 \tag{8.4.9} $$

This yields two constants of the motion. One of them will be absorbed immediately into the definition of p; we choose to normalize p so that the solution of (8.4.9) is

$$ \frac{dt}{dp} = \frac{1}{B(r)} \tag{8.4.10} $$

Since B(r) is close to unity, p is nearly equal to the coordinate time t. The other constant is obtained from (8.4.8), and plays the role of an angular momentum per unit mass

$$ r^2\frac{d\varphi}{dp} = J \quad (\text{constant}) \tag{8.4.11} $$

Inserting (8.4.7), (8.4.10), and (8.4.11) in (8.4.3) gives the remaining equation of motion as

$$ 0 = \frac{d^2r}{dp^2} + \frac{A'(r)}{2A(r)}\left(\frac{dr}{dp}\right)^2 - \frac{J^2}{r^3A(r)} + \frac{B'(r)}{2A(r)B^2(r)} $$

By multiplying this equation with $2A(r)\,dr/dp$, we may write it as

$$ \frac{d}{dp}\left\{A(r)\left(\frac{dr}{dp}\right)^2 + \frac{J^2}{r^2} - \frac{1}{B(r)}\right\} = 0 \tag{8.4.12} $$

and our last constant of the motion is therefore

$$ A(r)\left(\frac{dr}{dp}\right)^2 + \frac{J^2}{r^2} - \frac{1}{B(r)} = -E \quad (\text{constant}) \tag{8.4.13} $$

The proper time $\tau$ may now be determined from (8.4.1), (8.4.7), (8.4.10), (8.4.11), and (8.4.13); we find that

$$ d\tau^2 = E\,dp^2 \tag{8.4.14} $$

in accordance with our earlier remark that (8.4.2) forces $d\tau/dp$ to be constant. We see that E must take the values

$$ E > 0 \quad \text{for material particles} \tag{8.4.15} $$

$$ E = 0 \quad \text{for photons} \tag{8.4.16} $$
Also, A(r) is in practice always positive, so (8.4.13) tells us that our particle can reach radius r only if

$$ \frac{J^2}{r^2} + E \leq \frac{1}{B(r)} \tag{8.4.17} $$
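The constants of the motion found above can be checked by integrating the equatorial equations of free fall directly. The following is a minimal sketch, assuming the Schwarzschild functions B = 1 − 2MG/r and A = 1/B, units with G = c = 1, a classical fourth-order Runge-Kutta step, and purely illustrative initial data (`r0`, `J`, step size); none of these choices come from the text.

```python
def B(r, mg):
    return 1 - 2 * mg / r

def dB(r, mg):
    return 2 * mg / r**2

def rhs(state, mg):
    """Right-hand side of (8.4.3), (8.4.5), (8.4.6) with theta = pi/2."""
    r, phi, t, vr, vphi, vt = state
    b, db = B(r, mg), dB(r, mg)
    a = 1 / b            # Schwarzschild: A = 1/B
    da = -db / b**2      # A'(r)
    ar = -(da / (2 * a)) * vr**2 + (r / a) * vphi**2 - (db / (2 * a)) * vt**2
    aphi = -(2 / r) * vphi * vr
    at = -(db / b) * vt * vr
    return (vr, vphi, vt, ar, aphi, at)

def rk4_step(state, h, mg):
    """One classical Runge-Kutta step in the parameter p."""
    def shifted(s, k, f):
        return tuple(x + f * y for x, y in zip(s, k))
    k1 = rhs(state, mg)
    k2 = rhs(shifted(state, k1, h / 2), mg)
    k3 = rhs(shifted(state, k2, h / 2), mg)
    k4 = rhs(shifted(state, k3, h), mg)
    return tuple(x + h / 6 * (a + 2 * b + 2 * c + d)
                 for x, a, b, c, d in zip(state, k1, k2, k3, k4))

def constants(state, mg):
    """Return (J, E, B dt/dp) from (8.4.11), (8.4.13), (8.4.10)."""
    r, _, _, vr, vphi, vt = state
    b = B(r, mg)
    J = r**2 * vphi
    E = 1 / b - vr**2 / b - J**2 / r**2   # (8.4.13) with A = 1/B
    return J, E, b * vt

def integrate(mg=1.0, r0=20.0, J=5.0, h=0.5, steps=4000):
    """Start at a turning point (dr/dp = 0) and compare constants before and after."""
    state = (r0, 0.0, 0.0, 0.0, J / r0**2, 1 / B(r0, mg))
    start = constants(state, mg)
    for _ in range(steps):
        state = rk4_step(state, h, mg)
    return start, constants(state, mg)

before, after = integrate()
print(before)
print(after)
```

With these initial data the orbit is bound, and the three quantities should agree before and after to within the integrator's truncation error.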


The parameter p may be eliminated everywhere by using (8.4.10) in (8.4.11), (8.4.13), and (8.4.14); we then have

$$ r^2\frac{d\varphi}{dt} = J B(r) \tag{8.4.18} $$

$$ \frac{A(r)}{B^2(r)}\left(\frac{dr}{dt}\right)^2 + \frac{J^2}{r^2} - \frac{1}{B(r)} = -E \tag{8.4.19} $$

$$ d\tau^2 = E B^2(r)\,dt^2 \tag{8.4.20} $$

For a slowly moving particle in a weak field $J^2/r^2$, $(dr/dt)^2$, $A - 1$, and $B - 1 \simeq 2\phi$ will all be small, and to first order in these quantities the above equations of motion become

$$ r^2\frac{d\varphi}{dt} \simeq J $$

$$ \frac{1}{2}\left(\frac{dr}{dt}\right)^2 + \frac{J^2}{2r^2} + \phi \simeq \frac{1 - E}{2} $$

These are the same equations as would hold in Newton’s theory, with $(1 - E)/2$ playing the role of an energy per unit mass.

To see how the exact equations of motion work in a simple case, consider a particle in a circular orbit at radius R. Since dr/dt vanishes, Eq. (8.4.19) gives

$$ \frac{J^2}{R^2} - \frac{1}{B(R)} + E = 0 \tag{8.4.21} $$

Also, for equilibrium at this radius, the derivative at R of the left-hand side must also vanish, so

$$ -\frac{2J^2}{R^3} + \frac{B'(R)}{B^2(R)} = 0 \tag{8.4.22} $$

(If we regard a circle as the limit of an ellipse with perihelia $R - \delta$ and aphelia $R + \delta$, then (8.4.19) shows that $J^2/r^2 - 1/B(r) + E$ must vanish at $r = R \pm \delta$, and this gives (8.4.21) and (8.4.22) in the limit $\delta \to 0$.) From (8.4.21) and (8.4.22) we find

$$ E = \frac{1}{B(R)}\left[1 - \frac{R\,B'(R)}{2B(R)}\right] \tag{8.4.23} $$

$$ J^2 = \frac{B'(R)\,R^3}{2B^2(R)} \tag{8.4.24} $$
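Using (8.4.24) in (8.4.18) gives the rate of revolution as

$$ \frac{d\varphi}{dt} = \left(\frac{B'(R)}{2R}\right)^{1/2} \tag{8.4.25} $$

whereas (8.4.23) and (8.4.20) give the proper time as

$$ \frac{d\tau}{dt} = \sqrt{B(R) - \tfrac{1}{2}R\,B'(R)} \tag{8.4.26} $$

By using the Robertson expansion (8.3.8), we find

$$ \frac{d\varphi}{dt} = \left(\frac{MG}{R^3}\right)^{1/2}\left[1 - (\beta - \gamma)\frac{MG}{R} + \cdots\right] \tag{8.4.27} $$

$$ \frac{d\tau}{dt} = 1 - \frac{3MG}{2R} + \cdots \tag{8.4.28} $$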
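To get a feeling for the sizes involved, (8.4.25) and (8.4.26) can be evaluated for a circular orbit around the sun. This sketch uses the exact Schwarzschild B(r) = 1 − 2MG/r rather than the Robertson expansion, takes M_⊙G = 1.475 km from Section 8.5, and assumes an orbital radius of 1.496 × 10⁸ km (roughly the earth-sun distance) and the speed of light in km/s; those last two numbers and the function names are assumptions made for illustration.

```python
import math

C_KM_PER_S = 299792.458   # speed of light, used to convert from c = 1 units
MG_SUN_KM = 1.475         # M_sun G in km, the value quoted in Section 8.5
R_ORBIT_KM = 1.496e8      # assumed orbital radius, roughly the earth-sun distance

def angular_rate(mg_km, radius_km):
    """d(phi)/dt = (B'(R)/2R)^(1/2), Eq. (8.4.25), converted to rad/s."""
    d_b = 2 * mg_km / radius_km**2
    return math.sqrt(d_b / (2 * radius_km)) * C_KM_PER_S

def proper_time_rate(mg_km, radius_km):
    """d(tau)/dt = sqrt(B - R B'/2), Eq. (8.4.26)."""
    b = 1 - 2 * mg_km / radius_km
    d_b = 2 * mg_km / radius_km**2
    return math.sqrt(b - 0.5 * radius_km * d_b)

period_days = 2 * math.pi / angular_rate(MG_SUN_KM, R_ORBIT_KM) / 86400
clock_deficit = 1 - proper_time_rate(MG_SUN_KM, R_ORBIT_KM)
print(period_days, clock_deficit)
```

For the Schwarzschild B(r) the rate of revolution reduces exactly to Kepler's (MG/R³)^{1/2}, which is why the period comes out close to one year; the clock deficit is about 3MG/2R, as in (8.4.28).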


In most applications of general relativity we are more interested in the shape of orbits, that is, in r as a function of $\varphi$, than in their time history. The orbit shape can be obtained directly by eliminating dp from (8.4.11) and (8.4.13); this gives

$$ \frac{A(r)}{r^4}\left(\frac{dr}{d\varphi}\right)^2 + \frac{1}{r^2} - \frac{1}{J^2B(r)} = -\frac{E}{J^2} \tag{8.4.29} $$

The solution may then be determined by a quadrature:

$$ \varphi = \pm\int\frac{A^{1/2}(r)\,dr}{r^2\left(\dfrac{1}{J^2B(r)} - \dfrac{E}{J^2} - \dfrac{1}{r^2}\right)^{1/2}} \tag{8.4.30} $$


5 Unbound Orbits : Deflection of Light by the Sun 

Consider a particle or photon approaching the sun from very great distances. (See Figure 8.1.) At infinity the metric becomes Minkowskian, that is, $A(\infty) = B(\infty) = 1$, and we expect motion on a straight line at constant velocity V, that is,

$$ b \simeq r\sin(\varphi - \varphi_\infty) \simeq r(\varphi - \varphi_\infty) $$

$$ -V \simeq \frac{d}{dt}\bigl(r\cos(\varphi - \varphi_\infty)\bigr) \simeq \frac{dr}{dt} $$

where b is the “impact parameter” and $\varphi_\infty$ is the incident direction. Inserting these in (8.4.18) and (8.4.19), we see that they do satisfy the equations of motion at infinity, where A = B = 1, and that the constants of the motion are

$$ J = bV \tag{8.5.1} $$

$$ E = 1 - V^2 \tag{8.5.2} $$
Figure 8.1 Quantities referred to in the calculation of the deflection of light by the sun. (Deflection greatly exaggerated.)


(Of course a photon has V = 1, and as we have already seen, this gives E = 0.) It is often more convenient to express J in terms of the distance $r_0$ of closest approach to the sun, rather than the impact parameter b. At $r_0$, $dr/d\varphi$ vanishes, so (8.4.29) and (8.5.2) give

$$ J = r_0\left(\frac{1}{B(r_0)} - 1 + V^2\right)^{1/2} \tag{8.5.3} $$

The orbit is then described by (8.4.30), that is,

$$ \varphi(r) = \varphi_\infty + \int_r^\infty \frac{A^{1/2}(r)\,dr}{r^2\left\{\dfrac{1}{r_0^2}\left[\dfrac{1}{B(r)} - 1 + V^2\right]\left[\dfrac{1}{B(r_0)} - 1 + V^2\right]^{-1} - \dfrac{1}{r^2}\right\}^{1/2}} \tag{8.5.4} $$


The total change in $\varphi$ as r decreases from infinity to its minimum value $r_0$ and then increases again to infinity is just twice its change from $\infty$ to $r_0$, that is, $2|\varphi(r_0) - \varphi_\infty|$. If the trajectory were a straight line, this would equal just $\pi$; hence the deflection of the orbit from a straight line is

$$ \Delta\varphi = 2|\varphi(r_0) - \varphi_\infty| - \pi \tag{8.5.5} $$

If this is positive, then the angle $\varphi$ changes by more than 180°, that is, the trajectory is bent toward the sun; if $\Delta\varphi$ is negative then the trajectory is bent away from the sun.

For a photon $V^2 = 1$, and (8.5.4) gives

$$ \varphi(r) - \varphi_\infty = \int_r^\infty A^{1/2}(r)\left[\left(\frac{r}{r_0}\right)^2\left(\frac{B(r_0)}{B(r)}\right) - 1\right]^{-1/2}\frac{dr}{r} \tag{8.5.6} $$


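If we used the exact values of A(r) and B(r) given by the Schwarzschild solution (8.2.10), (8.2.11), then we would obtain $\varphi(r)$ and $\Delta\varphi$ as elliptic integrals of the usual sort, which could only be evaluated numerically by expanding in the small parameters $MG/r_0$ and $MG/r$. It is both easier and more instructive to expand before integrating, using for A(r) and B(r) the Robertson expansions (8.3.8) and (8.3.9):

$$ A(r) = 1 + 2\gamma\frac{MG}{r} + \cdots $$

$$ B(r) = 1 - \frac{2MG}{r} + \cdots $$

The argument of the second square root in (8.5.6) is then

$$ \left(\frac{r}{r_0}\right)^2\left(\frac{B(r_0)}{B(r)}\right) - 1 = \left(\frac{r}{r_0}\right)^2\left[1 + 2MG\left(\frac{1}{r} - \frac{1}{r_0}\right) + \cdots\right] - 1 = \left[\left(\frac{r}{r_0}\right)^2 - 1\right]\left[1 - \frac{2MGr}{r_0(r + r_0)} + \cdots\right] $$

so (8.5.6) gives

$$ \varphi(r) - \varphi_\infty = \int_r^\infty \frac{dr}{r\left[(r/r_0)^2 - 1\right]^{1/2}}\left[1 + \gamma\frac{MG}{r} + \frac{MGr}{r_0(r + r_0)} + \cdots\right] $$

The integral is elementary, and gives

$$ \varphi(r) - \varphi_\infty = \sin^{-1}\left(\frac{r_0}{r}\right) + \frac{MG}{r_0}\left[1 + \gamma - \gamma\sqrt{1 - \left(\frac{r_0}{r}\right)^2} - \sqrt{\frac{r - r_0}{r + r_0}}\,\right] + \cdots \tag{8.5.7} $$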

Hence to first order in $MG/r_0$, the deflection (8.5.5) is

$$ \Delta\varphi = \frac{4MG}{r_0}\left(\frac{1 + \gamma}{2}\right) \tag{8.5.8} $$

(To this order, we could just as well replace $r_0$ here with the impact parameter b.)
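The first-order result (8.5.8) can be compared with a direct numerical evaluation of (8.5.5) and (8.5.6) for the exact Schwarzschild functions. The sketch below is one possible way to do this, not the method of the text: it assumes the substitutions u = r₀/r and then u = 1 − s², which turn the integral into a smooth one on 0 ≤ s ≤ 1, and it uses Simpson's rule with an arbitrarily chosen number of panels.

```python
import math

def deflection_exact(m, n=2000):
    """Delta phi from (8.5.5)-(8.5.6) for Schwarzschild A and B, with m = MG/r0.

    With u = r0/r the integral for phi(r0) - phi_inf becomes
        int_0^1 du / sqrt(1 - 2m - u^2 + 2m u^3),
    and u = 1 - s^2 removes the inverse-square-root singularity at u = 1.
    """
    def f(s):
        u = 1 - s * s
        return 2 / math.sqrt((1 + u) - 2 * m * (1 + u + u * u))
    h = 1.0 / n                      # n must be even for Simpson's rule
    total = f(0.0) + f(1.0)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(i * h)
    half_angle = total * h / 3
    return 2 * half_angle - math.pi

def deflection_first_order(m, gamma=1.0):
    """Eq. (8.5.8) with m = MG/r0."""
    return 4 * m * (1 + gamma) / 2

for m in (1e-6, 1e-4, 1e-2):         # illustrative values of MG/r0
    print(m, deflection_exact(m), deflection_first_order(m))
```

The two values should differ only at relative order MG/r₀, which for a ray grazing the sun is about 2 × 10⁻⁶.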

For a light ray deflected by the sun we must use $M = M_\odot = 1.97 \times 10^{33}$ g, that is, $MG = M_\odot G = 1.475$ km, and the minimum value of $r_0$ is $R_\odot = 6.95 \times 10^5$ km, so (8.5.8) gives here

$$ \Delta\varphi = \left(\frac{R_\odot}{r_0}\right)\theta_\odot \tag{8.5.9} $$

where

$$ \theta_\odot \equiv \frac{4M_\odot G}{R_\odot}\left(\frac{1 + \gamma}{2}\right) = 1.75''\left(\frac{1 + \gamma}{2}\right) \tag{8.5.10} $$

Furthermore, general relativity gives $\gamma = 1$, so it predicts a deflection toward the sun, with $\theta_\odot = 1.75''$. (For light just grazing Jupiter the deflection is only 0.02″, so there seems little hope of observing the deflection of light by any other body than the sun.) In the Brans-Dicke theory (8.5.10) and (8.3.3) give a deflection constant

$$ \theta_\odot = \frac{4M_\odot G}{R_\odot}\left(\frac{2\omega + 3}{2\omega + 4}\right) \tag{8.5.11} $$
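The numbers quoted here are easy to reproduce. The sketch below evaluates the deflection constant (8.5.10) from the values M_⊙G = 1.475 km and R_⊙ = 6.95 × 10⁵ km given above, and the Brans-Dicke value using (8.3.3); the sample values of ω and the function names are purely illustrative assumptions.

```python
import math

ARCSEC_PER_RAD = 180 * 3600 / math.pi

def theta_sun(gamma, mg_km=1.475, r_sun_km=6.95e5):
    """Deflection constant (8.5.10) in seconds of arc."""
    return 4 * mg_km / r_sun_km * (1 + gamma) / 2 * ARCSEC_PER_RAD

def gamma_brans_dicke(omega):
    """Robertson parameter gamma in the Brans-Dicke theory, Eq. (8.3.3)."""
    return (omega + 1) / (omega + 2)

print(theta_sun(1.0))            # general relativity, gamma = 1
print(theta_sun(0.0))            # gamma = 0, i.e. A(r) = 1
for omega in (5, 50, 500):       # illustrative values only
    print(omega, theta_sun(gamma_brans_dicke(omega)))
```

As ω grows, the Brans-Dicke value approaches the general-relativistic one, so distinguishing the two requires measuring θ_⊙ to a few percent or better.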


Whenever we obtain a prediction from general relativity the question always arises (or should arise) whether the result obtained really refers to an objective physical measurement or whether it has folded into it arbitrary subjective elements dependent on our choice of coordinate system. In the case at hand, we should ask ourselves what the predicted change in $\varphi$ really has to do with the positions of stellar images on photographic plates. Fortunately, the answer is here quite simple, for this is really a scattering experiment. The light ray comes in from a very great distance, is deflected as it passes close to the sun, and is detected on earth, more than 200 solar radii away from the sun. At the points of origin and detection the metric is sensibly Minkowskian, and at these distances there is no question about the meaning of $\varphi$; it is the azimuthal angle in a system of coordinates within which light rays define lines that are essentially straight. Hence we can relate $\Delta\varphi$ to the shift of stellar images on photographic plates by the ordinary rules of geometric optics. (We are here neglecting effects of the gravitational field of the earth itself, because this field is on the earth’s surface more than $10^3$ times weaker than that of the sun on the sun’s surface.) We would have to be a good deal more careful about the operational significance of our $\varphi$ if we had to predict the deflection of light by the sun as seen from an observatory deep within the sun’s gravitational field, as, for instance, from an orbiting satellite a few solar radii away from the sun.

Another conceptual difficulty that may arise here has to do with our treatment of the photon as a quantum of light moving as would any other particle that happened to have velocity close to unity, that is, to c. Actually, no use is being made of quantum mechanics. The wavelength of light is so small compared with the scale of the solar gravitational field (i.e., $10^{-5}$ cm as compared with $10^{10}$ cm) that at any point in this field we can erect a locally inertial coordinate system that covers a huge (say, $10^{15}$) number of wavelengths. The Principle of Equivalence tells us that in such a coordinate system light behaves as it does in gravitation-free empty space, and since the wavelength is so small, this means that diffraction is negligible and each element of a wave front moves in a straight line at unit velocity. This statement, when rewritten in the noninertial coordinate systems of astronomy, is nothing but our equation of motion (8.4.2). (This argument, incidentally, shows why the deflection of light cannot depend on its polarization.)

Now let us see how Einstein’s prediction (8.5.9) compares with observation. The deflection angle $\Delta\varphi$ is classically measured by comparing the apparent positions of stars that happen to lie near the solar disk during an eclipse, when their light comes close to the sun and yet may be detected, with their positions at night six months earlier, when these stars lie on opposite sides of the earth from the sun, and their light does not pass close to the sun on its way to us. Subtracting $\varphi$ (six months earlier) from $\varphi$ (eclipse) then, in principle, should give $\Delta\varphi$. However, there is an unavoidable change in the scale of the photographs over a six-month interval, owing partly to small changes in the temperature and in the mechanical configuration of the telescope and camera over so long a time. A change in the scale of the photograph would give an apparent deflection of any star toward or away from the sun by an angle proportional to the distance $r_0$ at which its light passes the sun; hence what is done in practice is to compare observations with a theoretical curve

$$ \Delta\varphi = \theta_\odot\left(\frac{R_\odot}{r_0}\right) + S\left(\frac{r_0}{R_\odot}\right) \tag{8.5.12} $$

where S is the unknown scale constant (often called a) and $\theta_\odot$ is an angle to be compared with the theoretical value 1.75″. There are other effects that could contribute to $\Delta\varphi$, such as refraction of the starlight in the solar corona or as it enters the colder air in the moon’s shadow, but none of these is believed to play an important role.
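Because the theoretical curve (8.5.12) is linear in the two unknowns θ_⊙ and S, fitting it to measured deflections is an ordinary two-parameter least-squares problem. The sketch below shows one way this could be set up by solving the 2 × 2 normal equations; it is not the reduction procedure of any of the eclipse expeditions, and the data are synthetic, noise-free values generated from the curve itself purely for illustration.

```python
def fit_deflection(x, dphi):
    """Least-squares fit of dphi = theta / x + S * x, where x = r0 / R_sun.

    Solves the 2x2 normal equations and returns (theta, S).
    """
    a11 = sum(1 / xi**2 for xi in x)      # sum of (1/x)^2
    a12 = float(len(x))                   # sum of (1/x) * x
    a22 = sum(xi**2 for xi in x)          # sum of x^2
    b1 = sum(d / xi for xi, d in zip(x, dphi))
    b2 = sum(d * xi for xi, d in zip(x, dphi))
    det = a11 * a22 - a12 * a12
    return (b1 * a22 - b2 * a12) / det, (a11 * b2 - a12 * b1) / det

# Synthetic illustration only (not real eclipse data):
xs = [2.0, 3.0, 4.5, 6.0, 8.0]            # assumed values of r0 / R_sun
true_theta, true_scale = 1.75, 0.02       # assumed inputs, seconds of arc
ys = [true_theta / xi + true_scale * xi for xi in xs]
print(fit_deflection(xs, ys))
```

The two basis functions 1/x and x become hard to separate when the stars span only a narrow range of r₀/R_⊙, which is one reason a wide range of distances from the sun is valuable.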

Observations cannot be carried closer to the sun’s disk than $r_0 \approx 2R_\odot$, but they can still be used to determine $\theta_\odot$ by fitting the observed $\Delta\varphi$ values to the theoretical curve (8.5.12). The difficulty with this program is just that $\Delta\varphi$ is very difficult to measure accurately in the brief time available during an eclipse. In 1919 eclipse expeditions were sent to two small islands, Sobral, off the northeast coast of Brazil, and Principe, in the Gulf of Guinea. About a dozen stars in all were studied, and yielded values² 1.98 ± 0.12″ and 1.61 ± 0.31″, in substantial agreement with Einstein’s prediction $\theta_\odot = 1.75''$. It was perhaps this dramatic result more than any other success that brought general relativity to the attention of the general public in the 1920’s.

Since 1919 there have been measurements on about 380 stars observed during the eclipses of 1922, 1929, 1936, 1947, and 1952, which we summarize in Table 8.1 (taken from the summary of von Kluber³). The values obtained for $\theta_\odot$ vary from 1.3″ to 2.7″, but mostly lie between 1.7″ and 2″. The most recent of these results is $\Delta\varphi = 1.70 \pm 0.10''$, in very good agreement with Einstein’s prediction, but it is not clear that the systematic error here is really smaller than for previous observations. From all this we can conclude that there definitely is a deflection of light greater than the value $\theta_\odot = 0.875''$ that would be predicted for $\gamma = 0$ (i.e., A(r) = 1), but as to its precise value we can say little more than that $\theta_\odot$ is somewhere between 1.6″ and 2.2″; that is, $\gamma$ is between about 0.9 and 1.3. It may become possible to improve the accuracy of this determination in the near future by using photoelectric techniques to monitor star positions without waiting for an eclipse.

Table 8.1. Measurements of the Deflection of Light by the Sun.³ The fourth column gives the minimum and maximum values for the distance of closest approach of the light ray to the sun’s center for the various stars studied. The fifth column gives the deduced value for the deflection of a light ray that just grazes the sun’s surface.

Eclipse              Site        Number of Stars   r_0/R_⊙     θ_⊙ (sec)       Ref.
May 29, 1919         Sobral      7                 2-6         1.98 ± 0.16     a
                     Principe    5                 2-6         1.61 ± 0.40     a
September 21, 1922   Australia   11-14             2-10        1.77 ± 0.40     b
                     Australia   18                2-10        1.42 to 2.16    c
                     Australia   62-85             2.1-14.5    1.72 ± 0.15     d
                     Australia   145               2.1-42      1.82 ± 0.20     e
May 9, 1929          Sumatra     17-18             1.5-7.5     2.24 ± 0.10     f
June 19, 1936        U.S.S.R.    16-29             2-7.2       2.73 ± 0.31     g
                     Japan       8                 4-7         1.28 to 2.13    h
May 20, 1947         Brazil      51                3.3-10.2    2.01 ± 0.27     i
February 25, 1952    Sudan       9-11              2.1-8.6     1.70 ± 0.10     j

a F. W. Dyson, A. S. Eddington, and C. Davidson, Phil. Trans. Roy. Soc., 220A, 291 (1920); Mem. Roy. Astron. Soc., 62, 291 (1920).
b G. F. Dodwell and C. R. Davidson, Mon. Not. Roy. Astron. Soc., 84, 150 (1924).
c C. A. Chant and R. K. Young, Publ. Dominion Astron. Obs., 2, 275 (1924).
d W. W. Campbell and R. Trumpler, Lick Observ. Bull., 11, 41 (1923); Publ. Astron. Soc. Pacific, 35, 158 (1923).
e W. W. Campbell and R. Trumpler, Lick Observ. Bull., 13, 130 (1928).
f E. F. Freundlich, H. v. Kluber, and A. v. Brunn, Ab. Preuss. Akad. Wiss., No. 1, 1931; Z. Astrophys., 3, 171 (1931).
g A. A. Mikhailov, C. R. Acad. Sci. USSR (N. S.), 29, 189 (1940).
h T. Matukuma, A. Onuki, S. Yosida, and Y. Iwana, Jap. J. Astron. and Geophys., 18, 51 (1940).
i G. van Biesbroeck, Astron. J., 55, 49, 247 (1949).
j G. van Biesbroeck, Astron. J., 58, 87 (1953).

Recent developments in radio astronomy⁴ have made it possible to measure the deflection of radio signals by the sun with potentially far greater accuracy than is possible in optical astronomy. The angular accuracy of optical observations is limited by inhomogeneities in the earth’s atmosphere to about 0.1″, whereas a radio interferometer with wavelength $\lambda$ and baseline D can in principle measure angles with an accuracy of order $\lambda/2\pi D$ radians; this is 0.1″ for $\lambda = 3$ cm and D = 10 km, and proportionately less for longer baselines.
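The order-of-magnitude estimate λ/2πD for a radio interferometer is simple to evaluate. The sketch below converts it to seconds of arc for the 3 cm wavelength and 10 km baseline mentioned in the text; the function name and the second, longer baseline are illustrative assumptions.

```python
import math

ARCSEC_PER_RAD = 180 * 3600 / math.pi

def interferometer_resolution_arcsec(wavelength_m, baseline_m):
    """Angular accuracy of order lambda / (2 pi D), in seconds of arc."""
    return wavelength_m / (2 * math.pi * baseline_m) * ARCSEC_PER_RAD

print(interferometer_resolution_arcsec(0.03, 10e3))    # 3 cm, 10 km baseline
print(interferometer_resolution_arcsec(0.03, 1000e3))  # same wavelength, 1000 km (assumed)
```

Since the estimate scales as 1/D, each tenfold increase in baseline gives a tenfold finer angular scale, which is the attraction of very long baselines.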

One complication that bothers the astronomer more at radio than at optical frequencies is the refraction of rays in the solar corona. At X-band frequencies (8000-12500 MHz) the refraction is quite small, and can be eliminated by excluding data taken when the radio signal passes within about 2 solar radii of the sun’s surface. However, at S-band frequencies (2000-4000 MHz) it is necessary to analyze the data in terms of a model, in which part of the deflection arises from general relativity, and the rest is produced by the corona. The parameters describing the solar corona can in principle be measured by this technique (using several frequencies) at the same time as the general-relativistic parameter, but the electron densities in the corona change with time, and it appears that the only really satisfactory method for dealing with coronal refraction is to use a radio frequency in the X-band, or above.

Each October, the quasi-stellar source 3C279 is occulted by the sun, and a number of radio astronomy groups have taken this opportunity to observe the changes in the angle (about 9.5°) between 3C279 and the quasi-stellar source 3C273, during the period just before and just after occultation. The results are given in Table 8.2. Again, we see general relativity confirmed, but not yet well enough to distinguish between the theories of Einstein and Brans and Dicke. However, the data taken with very long baselines, such as the “Goldstack” baseline of 3900 km, contain enough information in principle to measure angular positions to about 0.001″. It is hoped that the analysis of this data will eventually lead to a really precise determination of $\gamma$.

Table 8.2. Interferometric measurements of the deflection of radio waves from the source 3C279 by the sun. The data are analyzed in terms of the deflection $\theta_\odot$ that would be produced by a radio signal just grazing the sun.

Radar Facility       Frequency (MHz)   Baseline (km)   Period               θ_⊙ (sec)           Ref.
Owens Valley         9602              1.0662          9/30-10/15, 1969     1.77 ± 0.20         a
Goldstone            2388              21.566          10/2-10/10, 1969     1.82 +0.24/-0.17    b
Goldstone/Haystack   7840              3899.92         9/30-10/15, 1969     1.80 ± 0.2          c
NRAO                 2695 & 8085       2.7             10/2-10/12, 1970     1.57 ± 0.08         d
                     2697 & 4993.8     1.41            10/8, 1970           1.87 ± 0.3          e

a G. A. Seielstad, R. A. Sramek, and K. W. Weller, Phys. Rev. Letters, 24, 1373 (1970).
b D. O. Muhleman, R. D. Ekers, and E. B. Fomalont, Phys. Rev. Letters, 24, 1377 (1970).
c I. I. Shapiro, private communication.
d R. A. Sramek, Ap. J., 167, L55 (1971).
e J. M. Hill, Mon. Not. Roy. Astron. Soc., 153, 7P (1971).


6 Bound Orbits : Precession of Perihelia 

Now consider a test particle bound in an orbit around the sun. (See Figure 8.2.) At perihelia and aphelia, r reaches its minimum and maximum values $r_-$ and $r_+$, and at both points $dr/d\varphi$ vanishes, so (8.4.29) gives

$$ \frac{1}{r_\pm^2} - \frac{1}{J^2B(r_\pm)} = -\frac{E}{J^2} $$

Figure 8.2 Elements of an ellipse referred to in the calculation of the precession of planetary orbits. (The ellipse here has the same eccentricity as the orbit of Icarus.)

From these two equations we can derive values for the two constants of the motion:

$$ E = \frac{\dfrac{r_+^2}{B(r_+)} - \dfrac{r_-^2}{B(r_-)}}{r_+^2 - r_-^2} \tag{8.6.1} $$

$$ J^2 = \frac{\dfrac{1}{B(r_+)} - \dfrac{1}{B(r_-)}}{\dfrac{1}{r_+^2} - \dfrac{1}{r_-^2}} \tag{8.6.2} $$


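The angle swept out by the position vector as r increases from $r_-$ is given by Eq. (8.4.30) as

$$ \varphi(r) = \varphi(r_-) + \int_{r_-}^{r} A^{1/2}(r)\left[\frac{1}{J^2B(r)} - \frac{E}{J^2} - \frac{1}{r^2}\right]^{-1/2}\frac{dr}{r^2} $$

or, using (8.6.1) and (8.6.2),

$$ \varphi(r) - \varphi(r_-) = \int_{r_-}^{r}\left[\frac{r_-^2\bigl(B^{-1}(r) - B^{-1}(r_-)\bigr) - r_+^2\bigl(B^{-1}(r) - B^{-1}(r_+)\bigr)}{r_+^2 r_-^2\bigl(B^{-1}(r_+) - B^{-1}(r_-)\bigr)} - \frac{1}{r^2}\right]^{-1/2} A^{1/2}(r)\,r^{-2}\,dr \tag{8.6.3} $$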


The change in $\varphi$ as $r$ decreases from $r_+$ to $r_-$ is the same as the change in $\varphi$ as $r$
increases from $r_-$ to $r_+$, so the total change in $\varphi$ per revolution is $2|\varphi(r_+) - \varphi(r_-)|$.
This would equal $2\pi$ if the orbit were a closed ellipse, so in general the orbit precesses
in each revolution by an angle

$$\Delta\varphi = 2|\varphi(r_+) - \varphi(r_-)| - 2\pi \tag{8.6.4}$$

Using the exact values of $A(r)$ and $B(r)$ given by the Schwarzschild solution
(8.2.10), (8.2.11) in (8.6.3) would yield formulas for $\varphi(r)$ and $\Delta\varphi$ as elliptic integrals,
and to evaluate them numerically we would have to expand in $MG/r$ and $MG/r_\pm$.
Instead we shall expand in the integrand, using for $A(r)$ and $B(r)$ the Robertson
expansions (8.3.8), (8.3.9):

$$A(r) = 1 + 2\gamma\frac{MG}{r} + \cdots \tag{8.6.5}$$

$$B(r) = 1 - \frac{2MG}{r} + \frac{2(\beta - \gamma)M^2G^2}{r^2} + \cdots$$

Note that there is a complete cancellation in (8.6.3) of the leading term in $B(r)$
but not of that in $A(r)$, so to calculate $\varphi$ and $\Delta\varphi$ to first order in $MG/r_\pm$ we need
$B(r)$ to second order in $MG/r$, whereas $A(r)$ will be needed only to first order.

It saves a great deal of work if we realize that by using the expansion

$$B^{-1}(r) \simeq 1 + \frac{2MG}{r} + \frac{2(2 - \beta + \gamma)M^2G^2}{r^2}$$

we make the argument of the first square root in (8.6.3) a quadratic function of
$1/r$. Furthermore, it vanishes at $r = r_\pm$, so

$$\frac{r_-^2\left(B^{-1}(r) - B^{-1}(r_-)\right) - r_+^2\left(B^{-1}(r) - B^{-1}(r_+)\right)}{r_+^2 r_-^2\left(B^{-1}(r_+) - B^{-1}(r_-)\right)} - \frac{1}{r^2} = C\left(\frac{1}{r_-} - \frac{1}{r}\right)\left(\frac{1}{r} - \frac{1}{r_+}\right)$$

The constant $C$ can be determined by letting $r \to \infty$:

$$C = \frac{r_+^2\left(1 - B^{-1}(r_+)\right) - r_-^2\left(1 - B^{-1}(r_-)\right)}{r_+ r_-\left(B^{-1}(r_+) - B^{-1}(r_-)\right)} \tag{8.6.6}$$


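or, factoring out of numerator and denominator a common factor $2(r_- - r_+)MG$:

$$C \simeq 1 - (2 - \beta + \gamma)MG\left(\frac{1}{r_+} + \frac{1}{r_-}\right) \tag{8.6.7}$$

Using (8.6.5)-(8.6.7) in (8.6.3) gives then

$$\varphi(r) - \varphi(r_-) \simeq \left[1 + \tfrac{1}{2}(2 - \beta + \gamma)MG\left(\frac{1}{r_+} + \frac{1}{r_-}\right)\right]\int_{r_-}^{r}\frac{\left[1 + \gamma MG/r\right]dr}{r^2\left[\left(\dfrac{1}{r_-} - \dfrac{1}{r}\right)\left(\dfrac{1}{r} - \dfrac{1}{r_+}\right)\right]^{1/2}}$$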

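The integral is made trivial by introducing a new variable $\psi$:

$$\frac{1}{r} = \frac{1}{2}\left(\frac{1}{r_+} + \frac{1}{r_-}\right) + \frac{1}{2}\left(\frac{1}{r_+} - \frac{1}{r_-}\right)\sin\psi \tag{8.6.8}$$

We then find

$$\varphi(r) - \varphi(r_-) = \left[1 + \tfrac{1}{2}(2 - \beta + 2\gamma)MG\left(\frac{1}{r_+} + \frac{1}{r_-}\right)\right]\left[\psi + \frac{\pi}{2}\right] - \tfrac{1}{2}\gamma MG\left(\frac{1}{r_+} - \frac{1}{r_-}\right)\cos\psi \tag{8.6.9}$$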


At aphelion $\psi = \pi/2$, so (8.6.4) and (8.6.9) give the precession per revolution as

$$\Delta\varphi = \frac{6\pi MG}{L}\left(\frac{2 - \beta + 2\gamma}{3}\right) \quad \text{(radians/revolution)} \tag{8.6.10}$$

where $L$ is the dimension of the ellipse called the semilatus rectum

$$\frac{1}{L} = \frac{1}{2}\left(\frac{1}{r_+} + \frac{1}{r_-}\right)$$

The elements of planetary orbits usually found in tables are the semimajor axis $a$
and eccentricity $e$, defined by

$$r_\pm = (1 \pm e)a$$

Hence we can determine $L$ from $a$ and $e$ by using the formula

$$L = (1 - e^2)a$$

Einstein’s field equations yield $\beta = \gamma = 1$, so they predict a precession

$$\Delta\varphi = \frac{6\pi MG}{L} \quad \text{radians/revolution} \tag{8.6.11}$$


This is positive, meaning that the whole orbit should precess in the same direction 
as the motion of the test particle. In the Brans-Dicke theory (8.6.10) and (8.3.3) give

$$\Delta\varphi = \left(\frac{6\pi MG}{L}\right)\left(\frac{3\omega + 4}{3\omega + 6}\right) \tag{8.6.12}$$

Once again, we should ask ourselves what the predicted value of $\Delta\varphi$ means.
This is not a scattering experiment like the deflection of light by the sun; here we
are dealing with an object that never gets out to infinity where the metric is
Minkowskian. Any observation of the test particle’s motion by optical or radar
astronomers will make use of light rays that are themselves affected by the
gravitational field, and if careful corrections are not made for the deflection of
light, the astronomer’s reported $\varphi(r)$ values will, at any given radius $r$, contain
errors of the order of $MG/L$. [See Eq. (8.5.8).] However, in practice these fine
points do not really matter, because the precession is cumulative. Equation (8.6.10)
shows that after $N$ revolutions the perihelia will have advanced by an angle of
order $N\,MG/L$, so if $N \gg 1$ it is unnecessary to worry about an error in $\varphi$ of order
$MG/L$. Indeed, Eq. (8.6.11) tells us that the perihelion will return to its original
azimuth after $L/3MG \gg 1$ revolutions, a prediction that clearly has nothing to do
with how we define $r$ or $\varphi$.

For Mercury we must take $L = 55.3 \times 10^6$ km, and of course $MG =
1.475$ km, so Eq. (8.6.11) gives $\Delta\varphi = 0.1038''$ per revolution. Since Mercury makes
415 revolutions per century, the prediction of general relativity is that

$$\Delta\varphi = 43.03'' \text{ per century} \qquad (\text{☿})$$
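The arithmetic behind this prediction can be retraced in a few lines. The sketch below evaluates Eq. (8.6.10) with the values of $L$, $MG$, and the number of revolutions per century quoted in the text; the function and constant names are illustrative choices, not notation from the text, and the last digit of the result depends on how the inputs are rounded.

```python
import math

ARCSEC_PER_RADIAN = 180.0 * 3600.0 / math.pi


def precession_per_revolution(L_km, MG_km, beta=1.0, gamma=1.0):
    """Eq. (8.6.10): perihelion advance per revolution, in radians.

    beta = gamma = 1 reproduces the general-relativistic value (8.6.11).
    """
    return 6.0 * math.pi * MG_km / L_km * (2.0 - beta + 2.0 * gamma) / 3.0


# Values quoted in the text for Mercury.
L_MERCURY_KM = 55.3e6     # semilatus rectum
MG_SUN_KM = 1.475         # MG for the sun, in km
REVS_PER_CENTURY = 415

per_rev = precession_per_revolution(L_MERCURY_KM, MG_SUN_KM) * ARCSEC_PER_RADIAN
per_century = per_rev * REVS_PER_CENTURY

print(f"{per_rev:.4f} arcsec per revolution")
print(f"{per_century:.2f} arcsec per century")
```

Passing other values of `beta` and `gamma` shows how the correction factor $(2 - \beta + 2\gamma)/3$ rescales the result.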

Fortunately there are accurate observations of Mercury going back to 1765.
These data were reanalyzed by Clemence 5 in 1943; he finds $\Delta\varphi = 43.11 \pm 0.45''$
per century, essentially confirming Newcomb’s earlier value (see Section 1.2),
and in excellent agreement with general relativity. Taken at face value, this
agreement shows that the correction factor in Eq. (8.6.10) is

$$\frac{2 - \beta + 2\gamma}{3} = 1.00 \pm 0.01$$

This is by far the most important experimental verification of general relativity,
both by virtue of its high accuracy, and because it alone is sensitive to the coefficient
$\beta$ appearing in the second-order term in $g_{tt}$.


Table 8.3. Comparison of Theoretical and Observed Centennial Precessions of
Planetary Orbits. 6

                a                               Revolutions    Δφ (seconds/century)
Planet          (10^6 km)   e        6πMG/L     per Century    Gen. Rel.   Observed
Mercury (☿)     57.91       0.2056   0.1038"    415            43.03       43.11 ± 0.45
Venus (♀)       108.21      0.0068   0.058"     149            8.6         8.4 ± 4.8
Earth (⊕)       149.60      0.0167   0.038"     100            3.8         5.0 ± 1.2
Icarus          161.0       0.827    0.115"     89             10.3        9.8 ± 0.8

Results 6 for Venus, the earth, and Icarus are listed together with those for
Mercury in Table 8.3. Evidently the accuracy available from the major planets
degrades rapidly as we move away from the sun, both because the smaller
eccentricities make observation of the perihelia more uncertain, and because as $L$
increases, the precession per revolution and the revolutions per century both
decrease. Icarus was only discovered in 1949, but is in some respects the most
useful object to study because its small size, its close approach to the earth, and
the large eccentricity of its orbit allow its precession to be determined with high
accuracy. Suggestions have been made to put an artificial satellite into an eccentric
orbit close to the sun; for instance, a satellite with $L = 10R_\odot$ would have a
centennial precession equal to 8250"! The trouble here is that a small object would be
subject to nongravitational perturbations such as radiation pressure, solar wind,
and micrometeorites, which of course have a negligible effect on Mercury and Icarus.

There are two caveats that should be kept in mind in assessing the agreement
of the observed advance of perihelia with the prediction of general relativity.
First, there are many known perturbations that contribute to the precession of
planetary orbits. In particular, Newtonian theory would give Mercury a precession

$$\Delta\varphi_N = 5557.62 \pm 0.20'' \qquad (\text{☿})$$

of which about 5025" is due to the rotation of the earth-based astronomical
coordinate system, and about 532" is due to gravitational perturbations calculated
by Newtonian perturbation theory from the motion of the other planets, chiefly
Venus, earth, and Jupiter. The precession actually observed is

$$\Delta\varphi_{\rm obs} = 5600.73 \pm 0.41'' \qquad (\text{☿})$$

and the value $\Delta\varphi = 43.11 \pm 0.45''$ quoted above for the “observed” anomalous
precession is obtained by subtracting the Newtonian precession from what is
observed, that is,

$$\Delta\varphi = \Delta\varphi_{\rm obs} - \Delta\varphi_N \tag{8.6.13}$$

One may ask how we know that this is the right quantity to compare with the
general-relativistic result of 43.03" per century. That is, how do we know that the
total precession is correctly given by adding the Newtonian value $\Delta\varphi_N$, calculated
while forgetting all effects of general relativity, to the Einstein value $\Delta\varphi_{GR}$,
calculated forgetting all effects of planetary perturbations? To some extent this
question can be answered by noting that the general-relativistic corrections to
$\Delta\varphi_N$ would be of order $MG/L$ times $\Delta\varphi_N$, or only about $10^{-4}$" per century. For a
fuller answer we shall have to wait until the discussion in the next chapter of the
post-Newtonian approximation. But even granting that (8.6.13) is in principle
correct, we should be aware that a very small systematic error in either $\Delta\varphi_N$ or
$\Delta\varphi_{\rm obs}$ could completely destroy the agreement between theory and observation.
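The subtraction (8.6.13) and the order-of-magnitude estimate of the cross term are easy to reproduce. The sketch below uses the figures quoted above. Combining the two uncertainties in quadrature is an assumption made here for illustration (the text does not say how the quoted ±0.45" was obtained), and the variable names are likewise illustrative.

```python
import math

# Centennial precessions of Mercury quoted in the text, in seconds of arc.
NEWTONIAN, NEWTONIAN_ERR = 5557.62, 0.20
OBSERVED, OBSERVED_ERR = 5600.73, 0.41

# Eq. (8.6.13): anomalous precession = observed - Newtonian.
anomalous = OBSERVED - NEWTONIAN

# Assumption: the two errors are independent and add in quadrature.
anomalous_err = math.hypot(NEWTONIAN_ERR, OBSERVED_ERR)

# Rough size of the relativistic correction to the Newtonian term,
# of order (MG/L) times the Newtonian precession.
MG_SUN_KM = 1.475
L_MERCURY_KM = 55.3e6
cross_term = MG_SUN_KM / L_MERCURY_KM * NEWTONIAN

print(f"anomalous precession: {anomalous:.2f} +/- {anomalous_err:.2f} arcsec/century")
print(f"cross-term estimate:  {cross_term:.1e} arcsec/century")
```

Under these assumptions the quadrature sum lands close to the quoted ±0.45", and the cross term comes out at the $10^{-4}$" level cited in the text.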

The second warning is that very small unknown effects may possibly be
contributing to the observed precession of perihelia an amount comparable with
that expected from general relativity. Indeed, we saw in Chapter 1 that Newcomb
had by 1911 abandoned his earlier suggestion of a small departure from the
inverse-square law, because the observed anomalous precession of 43" per century could
also be explained within Newtonian mechanics as due to the gravitational field
produced by the matter which causes the “zodiacal light.” (Today we know that
there is not enough matter between Mercury and the sun to produce any appreciable
precession.) It is also possible that the sun is slightly oblate, 7 in which case its
Newtonian potential would have an $r^{-3}$ term, giving the planets an anomalous
precession per revolution decreasing as the inverse square of their distance from
the sun. Table 8.3 shows that in fact the observed anomalous precession per
revolution decreases roughly as $1/r$, not $1/r^2$, in agreement with the prediction of
general relativity. Even more important, a large solar oblateness would produce
an anomalous precession of the planes of the inner planets’ orbits that is not
observed. 8 These two arguments together rule out the possibility of explaining
all of the observed anomalous precession as owing to solar oblateness, but this
explanation might account for up to 20% of the observed effect. To test this
hypothesis Dicke and Goldenberg 9 scanned the solar disk photoelectrically during
the period June 1 to September 23, 1966. They concluded that the sun’s polar
diameter is shorter than its equatorial diameter by 5.0 ± 0.7 parts in $10^5$. Taken
at face value, this would give the perihelia of Mercury an extra precession of 3.4"
per century, so that only 39.6" per century would be left as the relativistic effect, an
8% disagreement with Einstein’s prediction of 43.03" per century. The Brans-Dicke
theory can account for an excess centennial precession of 39.6" if we take
$\omega = 6.4$. However, there are several reasons for hesitating before we give up
general relativity:

(A) To account for the solar oblateness the solar interior would have to be
rotating about once in one or two days, very much faster than the observed
rotation rate (i.e., once in 25 days) of the sun’s surface. This difference in rotation
rates could perhaps be explained 10 as due to a magnetic torque induced by the
solar wind, which retards the rotation of the surface, but it is not clear that this
configuration would be dynamically stable. 11

(B) Two very elaborate series of measurements 12 made with the Göttingen
heliometer during the period 1891-1902 gave for the difference between the sun’s
equatorial and polar diameters the values 0.36 ± 0.78 parts in $10^5$ and −0.10 ± 0.47
parts in $10^5$, respectively, in agreement with each other and with perfect sphericity,
but in disagreement with the result +5.0 ± 0.7 parts in $10^5$ of Dicke and
Goldenberg. The Göttingen results were also supported by subsequent heliometer
measurements. To quote Ashbrook: 12

“What are we to make of all this? In view of the astronomical evidence, can the
sun’s polar diameter be 0.1 seconds shorter than its equatorial, as Drs. Dicke
and Goldenberg think? Was there some unsuspected subtle systematic error in
the Princeton experiment? Or was there some unrecognized effect in all the
heliometer series?”

To do Dicke and Goldenberg justice, it should be noted that the axis of the oblate
ellipsoid they observe follows the axis of the sun’s rotation in its apparent annual 
oscillation, which certainly suggests that they are seeing something real. 

(C) Even if the visible solar surface is oblate, what does this really tell us 
about the shape of its mass distribution and about the sun’s gravitational field? 
Dicke 7 argues that the observed solar surface coincides with the gravitational 
equipotential surface, but this conclusion rests on a good deal of astrophysical 
theory and could be wrong. 

(D) Finally, if Dicke and Goldenberg are right, then the 1% agreement 
between Einstein’s prediction and the observed anomalous precession is a mere 
coincidence. 
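The value of $\omega$ quoted in the discussion of the Dicke-Goldenberg result follows from inverting the Brans-Dicke factor in Eq. (8.6.12). The sketch below is a minimal check of that arithmetic, using the precession figures given in the text; the function names are hypothetical, chosen here for illustration.

```python
def brans_dicke_factor(omega):
    """Factor multiplying 6*pi*MG/L in Eq. (8.6.12)."""
    return (3.0 * omega + 4.0) / (3.0 * omega + 6.0)


def omega_for_factor(f):
    """Solve f = (3*omega + 4)/(3*omega + 6) for omega (requires f < 1)."""
    return (6.0 * f - 4.0) / (3.0 * (1.0 - f))


EINSTEIN_PREDICTION = 43.03   # arcsec/century, from the text
RELATIVISTIC_RESIDUAL = 39.6  # arcsec/century left after the oblateness term

ratio = RELATIVISTIC_RESIDUAL / EINSTEIN_PREDICTION
omega = omega_for_factor(ratio)

print(f"required factor: {ratio:.3f}")
print(f"omega:           {omega:.1f}")
```

As $\omega$ grows the factor tends to 1 and the Brans-Dicke prediction approaches Einstein's, so a residual close to 43" per century would push $\omega$ to large values.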


7 Radar Echo Delay 

The classic tests of general relativity discussed in previous sections deal only 
with the shape of the trajectories of photons and planets. In recent years the 
development of high-speed electronics and high-power radar has opened to us the 
possibility of measuring motion as a function of time with the accuracy needed 
to test Einstein’s equations. In particular, I. I. Shapiro has proposed 13 and, 
together with a group at the Lincoln Laboratory, has carried out
measurements 14,15 of the time required for radar signals to travel to the inner planets and
be reflected back to earth.

As a first step toward understanding the significance of these measurements,
let us calculate the time required for a radar signal to go from one point with
$r = r_1$, $\theta = \pi/2$, $\varphi = \varphi_1$, to a second point $r = r_2$, $\theta = \pi/2$, $\varphi = \varphi_2$. The equation
governing the time history of orbits is Eq. (8.4.19):

$$\frac{A(r)}{B^2(r)}\left(\frac{dr}{dt}\right)^2 + \frac{J^2}{r^2} - \frac{1}{B(r)} = -E$$

We are dealing here with a light ray, so $E = 0$. Furthermore, $(dr/dt)^2$ must vanish
at the distance $r = r_0$ of closest approach to the sun, so

$$J^2 = \frac{r_0^2}{B(r_0)}$$

The equation of motion of a photon is then

$$\frac{A(r)}{B^2(r)}\left(\frac{dr}{dt}\right)^2 + \left(\frac{r_0}{r}\right)^2\frac{1}{B(r_0)} - \frac{1}{B(r)} = 0 \tag{8.7.1}$$

From (8.7.1) we see that the time required for light to go from $r_0$ to $r$ or from $r$
to $r_0$ is

$$t(r, r_0) = \int_{r_0}^{r}\left[\frac{A(r)/B(r)}{1 - \dfrac{B(r)}{B(r_0)}\left(\dfrac{r_0}{r}\right)^2}\right]^{1/2} dr \tag{8.7.2}$$

and of course the total time required for light to go from point 1 to point 2 is
(for $|\varphi_1 - \varphi_2| > \pi/2$)

$$t_{12} = t(r_1, r_0) + t(r_2, r_0) \tag{8.7.3}$$


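In order to evaluate the integral (8.7.2) we once again use in the integrand
the Robertson expansion of Section 3:

$$A(r) \simeq 1 + \frac{2\gamma MG}{r} \qquad B(r) \simeq 1 - \frac{2MG}{r}$$

We then have

$$1 - \frac{B(r)}{B(r_0)}\left(\frac{r_0}{r}\right)^2 \simeq 1 - \left[1 + 2MG\left(\frac{1}{r_0} - \frac{1}{r}\right)\right]\left(\frac{r_0}{r}\right)^2 = \left[1 - \frac{r_0^2}{r^2}\right]\left[1 - \frac{2MGr_0}{r(r + r_0)}\right]$$

so (8.7.2) gives, to first order in $MG/r$ and $MG/r_0$,

$$t(r, r_0) \simeq \int_{r_0}^{r}\left(1 - \frac{r_0^2}{r^2}\right)^{-1/2}\left[1 + \frac{(1 + \gamma)MG}{r} + \frac{MGr_0}{r(r + r_0)}\right] dr$$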


The integral is now elementary, and we find that the time required for light to go
from $r_0$ to $r$ is

$$t(r, r_0) \simeq \sqrt{r^2 - r_0^2} + (1 + \gamma)MG\ln\left(\frac{r + \sqrt{r^2 - r_0^2}}{r_0}\right) + MG\left(\frac{r - r_0}{r + r_0}\right)^{1/2} \tag{8.7.4}$$
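As a consistency check on (8.7.4), the first-order integrand $(1 - r_0^2/r^2)^{-1/2}\,[1 + (1+\gamma)MG/r + MGr_0/(r(r+r_0))]$ can be integrated numerically and compared with the closed form. The sketch below does this with Simpson's rule after the substitution $r = r_0\cosh u$, which removes the integrable singularity at $r = r_0$. The solar radius used here is an assumed round value not given in this section, and all function names are illustrative.

```python
import math


def closed_form_time(r, r0, MG, gamma):
    """Right-hand side of Eq. (8.7.4), in the same length units as r."""
    root = math.sqrt(r * r - r0 * r0)
    return (root
            + (1.0 + gamma) * MG * math.log((r + root) / r0)
            + MG * math.sqrt((r - r0) / (r + r0)))


def numerical_time(r, r0, MG, gamma, n=4000):
    """Simpson's rule for the first-order integrand behind (8.7.4).

    With r' = r0*cosh(u), dr'/sqrt(1 - r0^2/r'^2) becomes r' du,
    so the integrand is smooth over the whole range of u.
    """
    if n % 2:
        n += 1  # Simpson's rule needs an even number of intervals
    u_max = math.acosh(r / r0)
    h = u_max / n

    def integrand(u):
        rp = r0 * math.cosh(u)
        return rp * (1.0 + (1.0 + gamma) * MG / rp + MG * r0 / (rp * (rp + r0)))

    total = integrand(0.0) + integrand(u_max)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * integrand(i * h)
    return total * h / 3.0


MG_SUN_KM = 1.475            # from the text
R_SUN_KM = 6.96e5            # assumed solar radius, not quoted in this section
R_EARTH_ORBIT_KM = 149.60e6  # Table 8.3

exact = closed_form_time(R_EARTH_ORBIT_KM, R_SUN_KM, MG_SUN_KM, 1.0)
approx = numerical_time(R_EARTH_ORBIT_KM, R_SUN_KM, MG_SUN_KM, 1.0)
excess = exact - math.sqrt(R_EARTH_ORBIT_KM**2 - R_SUN_KM**2)

print(f"closed form minus quadrature: {exact - approx:.2e} km")
print(f"one-way excess over straight-line time: {excess:.2f} km")
```

The two evaluations should agree to well within a metre for these inputs, which checks the integration step but not the Robertson expansion that precedes it.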

The leading term $\sqrt{r^2 - r_0^2}$ is what we should expect if light traveled in straight
lines at unit velocity. The other terms evidently produce a general-relativistic
delay in the time it takes a radar signal to travel to Mercury and back. (Note that
a delay is opposite to what would have been expected from our experience with
slowly moving bodies like comets.) This excess delay is a maximum when Mercury
is at superior conjunction and the radar signal just grazes the sun; in this case
$r_0$ is about equal to the radius of the sun, $r_0 \simeq R_\odot$, and is much smaller than the
distances $r_\oplus$ and $r_{☿}$ of the earth and Mercury from the sun, so the maximum
round-trip excess time delay is given by (8.7.3) and (8.7.4) as

$$(\Delta t)_{\max} = 2\left[t(r_\oplus, R_\odot) + t(r_{☿}, R_\odot) - \sqrt{r_\oplus^2 - R_\odot^2} - \sqrt{r_{☿}^2 - R_\odot^2}\right]$$

$$\simeq 4M_\odot G\left\{1 + \left(\frac{1 + \gamma}{2}\right)\ln\left(\frac{4r_\oplus r_{☿}}{R_\odot^2}\right)\right\}$$

$$\simeq 5.9\ \text{km}\left\{1 + 11.2\left(\frac{1 + \gamma}{2}\right)\right\} \tag{8.7.5}$$


If Einstein’s field equations are correct, then $\gamma = 1$, and the maximum excess
time delay will be

$$(\Delta t)_{\max} \simeq 72\ \text{km} \simeq 240\ \mu\text{sec} \tag{8.7.6}$$
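The numbers in (8.7.5) and (8.7.6) can be reproduced directly. The sketch below evaluates the second line of (8.7.5) with $MG$ and the orbital radii taken from the text and Table 8.3. The solar radius and the use of Mercury's semimajor axis for $r_{☿}$ are assumptions made here, and the function name is illustrative.

```python
import math

C_KM_PER_S = 299792.458  # speed of light, to convert km of light travel into time


def max_excess_delay_km(MG, r_earth, r_mercury, R_sun, gamma=1.0):
    """Second line of Eq. (8.7.5); valid when R_sun is much smaller
    than both orbital radii. All lengths in km."""
    log_term = math.log(4.0 * r_earth * r_mercury / R_sun**2)
    return 4.0 * MG * (1.0 + 0.5 * (1.0 + gamma) * log_term)


MG_SUN_KM = 1.475          # from the text
R_EARTH_KM = 149.60e6      # Table 8.3
R_MERCURY_KM = 57.91e6     # Table 8.3 semimajor axis, used as a stand-in for r_Mercury
R_SUN_KM = 6.96e5          # assumed solar radius

delay_km = max_excess_delay_km(MG_SUN_KM, R_EARTH_KM, R_MERCURY_KM, R_SUN_KM)
delay_usec = delay_km / C_KM_PER_S * 1e6

print(f"{delay_km:.1f} km, {delay_usec:.0f} microseconds")
```

Changing `gamma` shows how weakly the delay depends on it: the logarithm is multiplied by $(1+\gamma)/2$, so a 10% change in $\gamma$ shifts the delay by only about 5%.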


There is no difficulty in telling time to within a microsecond over the time interval 
of order 20 min during which the radar signal goes to Mercury and returns. 
Nevertheless this experiment presents extraordinary difficulties in execution and 
interpretation. 

One trouble is that the radar signal is not reflected from just one “specular 
point” on Mercury’s surface, but rather comes from a good-sized area, and is 
therefore spread out in arrival time by several hundred microseconds. Shapiro’s 
group deals with this problem by a technique called “delay-Doppler mapping,” 
that is, by measuring the distribution of the return signal power in frequency as 
well as arrival time. Each element of the reflecting surface has a characteristic 
velocity relative to the radar antenna (owing to the rotation and orbital motion 
of both the earth and Mercury) and therefore reflects the radar signal with a 
characteristic Doppler shift in frequency. Thus, if the reflecting properties of the 
surface are known, it is possible to deduce the arrival time of the echo from the 
point on Mercury’s surface closest to the earth by analyzing the observed dis- 
tribution of the echo in arrival time and frequency. (The reflecting properties of 
the surface are determined by studying the echo when Mercury is near inferior 
conjunction, where the signal-to-noise ratio is greatest and general relativity has 
no appreciable effect on the radar travel time.) 

A more fundamental difficulty is that in order to compute an excess time delay
to within 10 μsec, for instance, we have to know the time that the radar signal
would have taken in the absence of the sun’s gravitation to that accuracy; that
is, we have to know the distance

$$(r_\oplus^2 - r_0^2)^{1/2} + (r_{☿}^2 - r_0^2)^{1/2}$$

to within 1.5 km! Here $r_\oplus$, $r_{☿}$, and $r_0$ are the distances (in “standard” coordinates)
from the center of the sun to the radar antenna on earth, to the point on Mercury’s
surface closest to the earth, and to the point of closest approach of the radar
signal to the sun. However, optical astronomy alone certainly does not provide
us with the locations of the centers of Mercury or the earth, or Mercury’s radius,
to anything like the needed accuracy. Indeed, this accuracy is so demanding that it
is necessary to specify whether one is dealing with standard, isotropic, or harmonic
coordinates; needless to say, the U.S. Naval Observatory does not usually draw
such fine distinctions! Shapiro’s group deals with this problem by using general
relativity theory itself to calculate $r_\oplus(t)$, $r_{☿}(t)$, and $r_0(t)$ in terms of a large set of
unknown parameters, including $\beta$, $\gamma$, $M_\odot G$, the equatorial radius of Mercury, and
the positions and velocities of Mercury and the earth at some initial time. These
parameters are then determined by fitting the observed radar travel times to
Mercury and back with the theoretical formulas (8.7.3) and (8.7.4).

The first run, using the 7840 MHz Haystack radar at Lincoln Laboratory
during the superior conjunctions of Mercury of April 28 to May 20, 1967, and
August 15 to September 10, 1967, gave good agreement between theory and
observation. 14 To put it quantitatively, if the radar delay times are computed
using Eqs. (8.7.3) and (8.7.4) with $\gamma$ left arbitrary, then the best fit is found for
$\gamma = 0.8 \pm 0.4$. (For purely technical reasons, $\beta$ was taken as unity in the
preliminary analysis.) Further observations at Haystack and improvements in data
analysis have since improved this result to 15

$$\gamma = 1.03 \pm 0.1 \tag{8.7.7}$$

(See Figure 8.3.) In addition, Shapiro has reanalyzed 16 over 400,000 older optical 
observations of the sun, moon, and planets in conjunction with the new radar 



Figure 8.3 Comparison of theory with observation for the time delay of radar echoes
from Venus. 15 (Courtesy of I. I. Shapiro.)

data, and finds that the quadrupole term in the sun’s gravitational potential has
the value $J_2 = (-0.8 \pm 2.5) \times 10^{-5}$, with $J_2$ defined by the Legendre expansion:

$$\phi_\odot = -\frac{GM_\odot}{r}\left[1 - \sum_{l=2}^{\infty} J_l\left(\frac{R_\odot}{r}\right)^l P_l(\cos\theta)\right]$$

For comparison, the solar oblateness found by Dicke and Goldenberg corresponds
to the quadrupole term $J_2 = (2.7 \pm 0.5) \times 10^{-5}$. If $J_2$ is constrained to vanish,
then Shapiro’s analysis gives values for the extra precessions of the perihelia of the
orbits of Mercury and Mars that are respectively 0.99 ± 0.01 and 1.07 ± 0.1
times the values predicted by general relativity.

Shapiro has also proposed 17 the measurement of time delays in the arrival 
of radio pulses from pulsars. The pulsar CP0952 approaches to within 5° of the sun, 
and at such times radio pulses would be delayed by about 50 μsec.

Recently a group 18 at the Jet Propulsion Laboratory measured the time 
delays of radar signals, sent from the earth to transponders aboard the artificial 
satellites Mariner 6 and 7 and thence back to earth, during the period March- 
June 1970 when these satellites were near superior conjunction. The best data 
were taken on April 28, 1970, when the radar signals passed within three solar 
radii of the sun’s center. Analysis of these data gives a time delay within 5% of 
that predicted by general relativity. Unfortunately, the radar frequencies used 
were in the S-band, about 2300 MHz, so the solar corona played a troublesome
role here. (See Section 8.5.) In addition, the Mariner satellites are small enough to 
be appreciably affected by nongravitational forces, chiefly solar radiation pressure, 
gas leakage, and thrust imbalances in the attitude control system. 

The sensitivity of the radar echo arrival times to fine details of orbital motion 
makes the calculation of “theoretical” arrival times an enormously difficult task,
which cannot be given the simple analytic treatment appropriate to this book. 
There is, however, some insight to be gained by looking at a simple calculation in 
a situation so highly idealized that we can deal with it here. Consider as a reflector 
a point planet labeled “1” in a circular orbit about the sun of radius $r_1$, and let
the radar antenna be placed on a planet “2” that lies in the plane of planet 1’s
orbit ($\theta = \pi/2$), but is so far from the sun that its position can be taken as fixed,
with $r_2 \gg r_1$ and $\varphi_2 = 0$. (The change in $\varphi_2$ during the radar signal travel time
vanishes as $r_2^{-1/2}$.) A radar signal emitted from planet 2 at time $t$ will reach
planet 1 at a time $t_1$ given (for $|\varphi_1| > \pi/2$) by

$$t_1 = t + t(r_1, r_0) + t(r_2, r_0)$$

or using (8.7.4) with $r_2 \to \infty$:

$$t_1 \simeq t + T + (r_1^2 - r_0^2)^{1/2} + MG\left(\frac{r_1 - r_0}{r_1 + r_0}\right)^{1/2} + (1 + \gamma)MG\ln\left[\frac{r_1^2 + r_1(r_1^2 - r_0^2)^{1/2}}{r_0^2}\right] \tag{8.7.8}$$

where $T$ is a large constant:

$$T = r_2 + MG + (1 + \gamma)MG\ln\left(\frac{2r_2}{r_1}\right) \tag{8.7.9}$$


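The azimuthal angle of the planet at this time is given by Eq. (8.4.27):

$$\varphi_1 = \varphi(0) + \omega t_1 \tag{8.7.10}$$

$$\omega \simeq \left(\frac{MG}{r_1^3}\right)^{1/2}\left(1 - \frac{(\beta - \gamma)MG}{r_1}\right) \tag{8.7.11}$$

Finally, $r_0$ is calculated by setting $\varphi_1$ equal to the value determined from Eq.
(8.5.7),

$$\varphi_1 = [\varphi(r_0) - \varphi(r_1)] + [\varphi(r_0) - \varphi(\infty)]$$

$$= \pi - \sin^{-1}\left(\frac{r_0}{r_1}\right) + \frac{MG}{r_0}\left[1 + \gamma + \gamma\left(1 - \frac{r_0^2}{r_1^2}\right)^{1/2} + \left(\frac{r_1 - r_0}{r_1 + r_0}\right)^{1/2}\right]$$

or to first order in $MG$:

$$r_0 \simeq r_1\sin\varphi_1 - MG\cot\varphi_1\left[1 + \gamma - \gamma\cos\varphi_1 + \left(\frac{1 - \sin\varphi_1}{1 + \sin\varphi_1}\right)^{1/2}\right] \tag{8.7.12}$$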


Using (8.7.10)-(8.7.12) in (8.7.8) gives the relation between the times $t$ and $t_1$
of radar signal emission and reflection:

$$t_1 \simeq t + T - a\cos(\omega t_1 + \varphi(0)) - b\{1 + \ln[1 + \cos(\omega t_1 + \varphi(0))]\} \tag{8.7.13}$$

where

$$a = r_1 - \gamma MG \tag{8.7.14}$$

$$b = (1 + \gamma)MG \tag{8.7.15}$$

Equation (8.7.13) can be solved for $t_1(t)$, and the arrival time of the echo back at
the antenna can then be determined as

$$t_2(t) = t + 2[t_1(t) - t] = 2t_1(t) - t \tag{8.7.16}$$

By comparing this theoretical prediction with the observed arrival times of the
radar echo, we can, in principle, determine the five parameters

$$T \qquad a \qquad b \qquad \omega \qquad \varphi(0)$$

But these five parameters depend on the six unknowns $r_1$, $r_2$, $MG$, $\gamma$, $\beta$, and $\varphi(0)$,
so even if our measurements and Eqs. (8.7.13)-(8.7.16) were perfectly accurate,
we still could not determine both $\beta$ and $\gamma$. The best we can do is to eliminate $r_1$
and $MG$ from the formulas (8.7.11), (8.7.14), and (8.7.15) for $\omega$, $a$, and $b$, obtaining
thereby a formula for $\gamma$:

$$1 + \gamma = b\,a^{-3}\omega^{-2} \tag{8.7.17}$$

Note that in this case $\beta$ cannot, even in principle, be determined from radar time
delay measurements. 19 It would be possible to determine both $\beta$ and $\gamma$ from
observations of radar echoes from two reflecting planets in circular orbits, since
in this case there are ten observable parameters and only eight unknowns. More
important, it is possible to measure $\beta$ by radar observations of Mercury alone,
because its orbit is sufficiently eccentric for its precession seriously to affect radar
arrival times.


8 The Schwarzschild Singularity* 

The reader will probably have noticed that the Schwarzschild solution (8.2.12)
becomes singular at $r = 2MG$. This radius corresponds to $\rho = MG/2$ and $R = MG$,
so we see that this singularity also occurs when the metric is expressed in its
isotropic form (8.2.14) or in its harmonic form (8.2.15). The radius $2GM$ at which
the singularity occurs in standard coordinates is called the Schwarzschild radius
of the mass $M$.

It should immediately be stressed that there is no Schwarzschild singularity
in the gravitational field of any known object in the universe. The Schwarzschild
singularity appeared in the solution of Einstein’s vacuum equations $R_{\mu\nu} = 0$,
and is therefore irrelevant if the radius $2GM$ lies within the massive body, where
we must use the full Einstein equation (7.1.13). For the sun the Schwarzschild
radius $2GM_\odot$ is 2.95 km, deep in the solar interior, and we shall see in Chapter 11
that the solution of the full Einstein equation inside a stable star exhibits no
Schwarzschild singularity, or any other singularity. For a proton the Schwarzschild
radius is $10^{-50}$ cm, and this is 37 orders of magnitude smaller than the
characteristic proton radius of about $10^{-13}$ cm! In Chapter 11 we discuss the possibility
that a very massive body might collapse to a radius smaller than its Schwarzschild
radius, but with this one hypothetical exception the Schwarzschild singularity
does not seem to have much relevance to the real world.

It is nonetheless instructive to imagine a body so small and massive that the 
radius 2GM lies outside it, in empty space. The Schwarzschild solution then holds 
down to this radius and actually displays a singularity. But is this singularity 
real? We can readily calculate the four nonvanishing curvature invariants
described in Section 6.7, and find that they are all perfectly well behaved at the


* This section lies somewhat out of the book’s main line of development, and may be omitted in a first
reading.

Schwarzschild radius, although they do become singular at the origin. This suggests
that the apparent Schwarzschild singularity may be only an artifact of the
coordinate systems we have used. (If any one of the curvature invariants had been
singular at the Schwarzschild radius, then this singularity would of course have
been present in all coordinate systems.) Only a few years ago a coordinate system
was found that allows us to avoid talking about a Schwarzschild singularity, if
we are willing to allow the world an unusual topology. 20 To exhibit this
reinterpretation of the Schwarzschild singularity, we introduce a new set of
coordinates $r'$, $\theta$, $\varphi$, $t'$, defined by

$$r'^2 - t'^2 \equiv T^2\left(\frac{r}{2GM} - 1\right)\exp\left(\frac{r}{2GM}\right) \tag{8.8.1}$$

$$\frac{2r't'}{r'^2 + t'^2} \equiv \tanh\left(\frac{t}{2MG}\right) \tag{8.8.2}$$

where $T$ is an arbitrary constant. The Schwarzschild solution (8.2.12) then becomes

$$d\tau^2 = \left(\frac{32G^3M^3}{rT^2}\right)\exp\left(-\frac{r}{2MG}\right)\left(dt'^2 - dr'^2\right) - r^2\,d\theta^2 - r^2\sin^2\theta\,d\varphi^2 \tag{8.8.3}$$

where $r$ is now to be understood as a function of $r'^2 - t'^2$ defined by Eq. (8.8.1).
The metric is nonsingular as long as $r^2$ is well defined and positive-definite, that is,
as long as

$$r'^2 > t'^2 - T^2$$


Hence, during the time interval $0 < t' < T$, the metric is a perfectly smooth
finite function of $r'$ for all real $r'$. Indeed, even $g_{\theta\theta}$ and $g_{\varphi\varphi}$ do not vanish when
$r' = 0$, so that when we approach the origin $r' = 0$ there is nothing to keep us
from continuing right through to negative $r'$! The space described by (8.8.3) is
therefore singularity-free, but consists of two identical sheets $r' > 0$ and $r' < 0$,
joined in a smooth way by a branch point at $r' = 0$. When $t'$ reaches the time $T$
the two sheets detach from each other, and thereafter the metric has a real
singularity at $r' = \pm\sqrt{t'^2 - T^2}$, that is, at $r = 0$. However, even so, the metric
has no singularity at the radius $r' = t'$ that corresponds to the Schwarzschild
radius $r = 2GM$.
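The statement that the metric (8.8.3) stays finite at $r' = t'$ can be illustrated numerically. The sketch below inverts Eq. (8.8.1) for $r$ by bisection and evaluates the coefficient of $(dt'^2 - dr'^2)$ in (8.8.3). Units with $MG = T = 1$ are an arbitrary choice made here, and the function names are hypothetical.

```python
import math


def r_from_primed(r_prime, t_prime, T=1.0, MG=1.0):
    """Solve Eq. (8.8.1), T^2 (r/2MG - 1) exp(r/2MG) = r'^2 - t'^2, for r > 0."""
    target = (r_prime * r_prime - t_prime * t_prime) / (T * T)
    if target <= -1.0:
        raise ValueError("r'^2 <= t'^2 - T^2: at or beyond the r = 0 singularity")

    def g(x):  # x = r/2MG; (x - 1) e^x is increasing for x >= 0
        return (x - 1.0) * math.exp(x) - target

    lo, hi = 0.0, 1.0
    while g(hi) < 0.0:      # expand the bracket until it contains the root
        hi *= 2.0
    for _ in range(200):    # plain bisection
        mid = 0.5 * (lo + hi)
        if g(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 2.0 * MG * 0.5 * (lo + hi)


def conformal_factor(r_prime, t_prime, T=1.0, MG=1.0):
    """Coefficient of (dt'^2 - dr'^2) in Eq. (8.8.3)."""
    r = r_from_primed(r_prime, t_prime, T, MG)
    return 32.0 * MG**3 / (r * T * T) * math.exp(-r / (2.0 * MG))


# On the line r' = t' the radius is the Schwarzschild radius 2MG,
# and the metric coefficient is finite there.
print(r_from_primed(0.7, 0.7), conformal_factor(0.7, 0.7))

# At r' = 0 with 0 < t' < T the radius r is still positive,
# so g_theta_theta = r^2 does not vanish.
print(r_from_primed(0.0, 0.5))
```

The `ValueError` branch marks the region $r'^2 \le t'^2 - T^2$, which is where the text locates the real singularity at $r = 0$.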

To repeat, this discussion of the Schwarzschild singularity does not apply to 
any gravitational field actually known to exist anywhere in the universe. Indeed, 
it does not even apply to gravitational collapse (see Section 11.9) because for
$t' < T$ space is empty for all $r'$. However, like Aesop’s fables, it is useful because
it points to a moral, that what appears in one coordinate system to be a singularity 
may in another coordinate system have quite a different interpretation.

“I think Isaac Newton is
doing most of the driving
right now.” Major William
A. Anders, during return
flight of first circumlunar
voyage, 12/26/68


9 POST-NEWTONIAN 
CELESTIAL MECHANICS 


The Einstein field equations are nonlinear, and therefore cannot in general be 
solved exactly. It is true that, by imposing the symmetry requirements of time 
independence and spatial isotropy, we were able to find one useful exact solution, 
the Schwarzschild metric, but we cannot actually make use of the full content of 
this solution, because in fact the solar system is not static and isotropic. Indeed, 
the Newtonian effects of the planets’ gravitational fields are an order of magnitude 
greater than the first corrections due to general relativity, and completely swamp 
the higher corrections that are in principle provided by the exact Schwarzschild 
solution. 

What we need then is not to find more exact solutions, but rather to develop 
some systematic approximation method that will not rely on any assumed sym- 
metry properties of the system. There are two such methods that have been 
particularly useful; they are called the post-Newtonian approximation 1 and the 
weak-field approximation. The first is adapted to a system of slowly moving 
particles bound together by gravitational forces, such as the solar system, and is 
the subject of this chapter. The second method treats the fields in a lower order of 
approximation but does not assume that the matter moves nonrelativistically ; 
it is therefore suited to handle the subject of gravitational radiation, and will be 
discussed in the next chapter. There obviously is an area of overlap between the 
two approximation methods, that is, for slowly moving particles moving in very 
weak fields, but it is best to keep them separate because of their separate 
applications. 

The post-Newtonian approximation was historically derived 1 as a by-product
of the study of the problem of motion: Do the equations of motion of massive
particles follow from the gravitational field equations alone? According to the 
point of view adopted in this book, the equations of motion in general relativity 
should be derived from the equations of motion in special relativity and the Principle 
of Equivalence. Therefore, in this chapter the post-Newtonian approximation 
is discussed for its own sake, not as part of the problem of motion. 


1 The Post-Newtonian Approximation 

Consider a system of particles that, like the sun and planets, are bound together
by their mutual gravitational attraction. Let $\bar M$, $\bar r$, and $\bar v$ be typical values
of the masses, separations, and velocities of these particles. It is a familiar result
of Newtonian mechanics that the typical kinetic energy $\tfrac{1}{2}\bar M\bar v^2$ will be roughly of
the same order of magnitude as the typical potential energy $G\bar M^2/\bar r$, so

$$ \bar v^2 \sim \frac{G\bar M}{\bar r} \tag{9.1.1} $$

(For instance, a test particle in a circular orbit of radius $r$ about a central mass $M$
will have velocity $v$ given in Newtonian mechanics by the exact formula $v^2 = GM/r$.)
The post-Newtonian approximation may be described as a method for
obtaining the motions of the system to one higher power of the small parameters
$G\bar M/\bar r$ and $\bar v^2$ than given by Newtonian mechanics. It is sometimes referred to as
an expansion in inverse powers of the speed of light, but since in our units this
speed is unity we prefer to say that our expansion parameter is $\bar v^2$, or equivalently,
$G\bar M/\bar r$.
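As a rough numerical illustration of (9.1.1), the following sketch compares $v^2$ with $GM/r$ for the earth's orbit in units with $c = 1$. The orbital data are assumed round values supplied here for illustration and are not taken from the text.

```python
# Assumed round values for the earth's orbit (not from the text).
C_KM_S = 299_792.458          # speed of light, km/s
GM_SUN_KM = 1.4766            # G * M_sun / c^2, km
R_EARTH_ORBIT_KM = 1.496e8    # mean earth-sun distance, km
V_EARTH_KM_S = 29.78          # mean orbital speed, km/s


def expansion_parameters(gm_km, r_km, v_km_s):
    """Return (v^2, GM/r), both dimensionless in units with c = 1."""
    v2 = (v_km_s / C_KM_S) ** 2
    gm_over_r = gm_km / r_km
    return v2, gm_over_r


v2, gm_over_r = expansion_parameters(GM_SUN_KM, R_EARTH_ORBIT_KM, V_EARTH_KM_S)
print(f"v^2  = {v2:.3e}")
print(f"GM/r = {gm_over_r:.3e}")
```

Both numbers come out near $10^{-8}$, which is why a single further power of this parameter already captures the leading relativistic effects in the solar system.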

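We must begin by asking what we need. The equations of motion of the particles
are

$$ \frac{d^2x^\mu}{d\tau^2} + \Gamma^\mu_{\nu\lambda}\frac{dx^\nu}{d\tau}\frac{dx^\lambda}{d\tau} = 0 $$

From this we may compute the accelerations as

$$ \frac{d^2x^i}{dt^2} = \left(\frac{dt}{d\tau}\right)^{-1}\frac{d}{d\tau}\left[\left(\frac{dt}{d\tau}\right)^{-1}\frac{dx^i}{d\tau}\right]
 = \left(\frac{dt}{d\tau}\right)^{-2}\frac{d^2x^i}{d\tau^2} - \left(\frac{dt}{d\tau}\right)^{-3}\frac{d^2t}{d\tau^2}\frac{dx^i}{d\tau} $$

$$ \phantom{\frac{d^2x^i}{dt^2}} = -\Gamma^i_{\nu\lambda}\frac{dx^\nu}{dt}\frac{dx^\lambda}{dt} + \Gamma^0_{\nu\lambda}\frac{dx^\nu}{dt}\frac{dx^\lambda}{dt}\frac{dx^i}{dt} $$

This may be written in more detail as

$$ \frac{d^2x^i}{dt^2} = -\Gamma^i_{00} - 2\Gamma^i_{0j}\frac{dx^j}{dt} - \Gamma^i_{jk}\frac{dx^j}{dt}\frac{dx^k}{dt}
 + \left[\Gamma^0_{00} + 2\Gamma^0_{0j}\frac{dx^j}{dt} + \Gamma^0_{jk}\frac{dx^j}{dt}\frac{dx^k}{dt}\right]\frac{dx^i}{dt} \tag{9.1.2} $$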


In the Newtonian approximation discussed in Section 3.4 we treated all velocities
as vanishingly small and kept only terms of first order in the difference between
$g_{\mu\nu}$ and the Minkowski tensor $\eta_{\mu\nu}$, and we found

$$ \frac{d^2x^i}{dt^2} \simeq -\Gamma^i_{00} \simeq \frac{1}{2}\frac{\partial g_{00}}{\partial x^i} $$

But $-g_{00} - 1$ is of order $G\bar M/\bar r$, so the Newtonian approximation gives $d^2x^i/dt^2$ to
order $G\bar M/\bar r^2$, that is, to order $\bar v^2/\bar r$. Therefore our objective in using the
post-Newtonian approximation will be to compute $d^2x^i/dt^2$ to order $\bar v^4/\bar r$. Inspection of
(9.1.2) shows that we shall need the various components of the affine connection
to the following orders:

We need $\Gamma^i_{00}$ to order $\bar v^4/\bar r$

We need $\Gamma^i_{0j}$ to order $\bar v^3/\bar r$

We need $\Gamma^i_{jk}$ to order $\bar v^2/\bar r$

We need $\Gamma^0_{00}$ to order $\bar v^3/\bar r$

We need $\Gamma^0_{0j}$ to order $\bar v^2/\bar r$

We need $\Gamma^0_{jk}$ to order $\bar v/\bar r$ (9.1.3)
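The counting that leads to (9.1.3) is simple bookkeeping: each factor $dx/dt$ multiplying a connection component in (9.1.2) is one power of $\bar v$, so a component multiplied by $n$ velocity factors is needed only to order $\bar v^{4-n}/\bar r$. A minimal sketch of that rule follows; the string labels are hypothetical names chosen here for illustration.

```python
# Number of velocity factors dx/dt multiplying each connection component in
# Eq. (9.1.2). The Gamma^0 components carry one extra factor dx^i/dt from
# outside the square bracket.
TARGET_ORDER = 4  # the acceleration is wanted to order v^4 / r

VELOCITY_FACTORS = {
    "Gamma^i_00": 0,
    "Gamma^i_0j": 1,
    "Gamma^i_jk": 2,
    "Gamma^0_00": 1,
    "Gamma^0_0j": 2,
    "Gamma^0_jk": 3,
}


def required_orders(target=TARGET_ORDER):
    """Power N such that each component is needed to order v^N / r."""
    return {name: target - n for name, n in VELOCITY_FACTORS.items()}


for name, order in required_orders().items():
    print(f"{name}: needed to order v^{order}/r")
```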

From our experience with the Schwarzschild solution, we expect that it
should be possible to find a coordinate system in which the metric tensor is nearly
equal to the Minkowski tensor $\eta_{\mu\nu}$, the corrections being expandable in powers
of $\bar M G/\bar r \sim \bar v^2$. In particular, we expect

$$ g_{00} = -1 + \overset{2}{g}_{00} + \overset{4}{g}_{00} + \cdots \tag{9.1.4} $$

$$ g_{ij} = \delta_{ij} + \overset{2}{g}_{ij} + \overset{4}{g}_{ij} + \cdots \tag{9.1.5} $$

$$ g_{i0} = \overset{3}{g}_{i0} + \overset{5}{g}_{i0} + \cdots \tag{9.1.6} $$


the symbol $\overset{N}{g}_{\mu\nu}$ denoting the term in $g_{\mu\nu}$ of order $\bar v^N$. Odd powers of $\bar v$ occur in
$g_{i0}$ because $g_{i0}$ must change sign under the time-reversal transformation $t \to -t$.
The real justification for these expansions will come below when we show that they
lead to a consistent solution of the Einstein field equations.

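The inverse of the metric tensor is defined by the equations

$$ g^{i\mu}g_{0\mu} = g^{i0}g_{00} + g^{ij}g_{j0} = 0 \tag{9.1.7} $$

$$ g^{0\mu}g_{0\mu} = g^{00}g_{00} + g^{0i}g_{0i} = 1 \tag{9.1.8} $$

$$ g^{i\mu}g_{j\mu} = g^{i0}g_{j0} + g^{ik}g_{jk} = \delta_{ij} \tag{9.1.9} $$

We expect that

$$ g^{00} = -1 + \overset{2}{g}{}^{00} + \overset{4}{g}{}^{00} + \cdots \tag{9.1.10} $$

$$ g^{ij} = \delta_{ij} + \overset{2}{g}{}^{ij} + \overset{4}{g}{}^{ij} + \cdots \tag{9.1.11} $$

$$ g^{i0} = \overset{3}{g}{}^{i0} + \overset{5}{g}{}^{i0} + \cdots \tag{9.1.12} $$

and inserting these expansions into the defining equations (9.1.7)-(9.1.9), we find

$$ \overset{2}{g}{}^{00} = -\overset{2}{g}_{00} \qquad \overset{2}{g}{}^{ij} = -\overset{2}{g}_{ij} \qquad \overset{3}{g}{}^{i0} = \overset{3}{g}_{i0} \qquad \text{etc.} \tag{9.1.13} $$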

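The affine connection may now be obtained from the familiar formula

$$ \Gamma^\mu_{\nu\lambda} = \frac{1}{2}g^{\mu\rho}\left\{\frac{\partial g_{\rho\nu}}{\partial x^\lambda} + \frac{\partial g_{\rho\lambda}}{\partial x^\nu} - \frac{\partial g_{\nu\lambda}}{\partial x^\rho}\right\} $$

In computing $\Gamma^\mu_{\nu\lambda}$ we must take into account the fact that the scales of distance
and time in our systems are set by $\bar r$ and $\bar r/\bar v$, respectively, so the space and time
derivatives should be regarded as being of order

$$ \frac{\partial}{\partial x^i} \sim \frac{1}{\bar r} \qquad \frac{\partial}{\partial t} \sim \frac{\bar v}{\bar r} $$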

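Using our estimates (9.1.4)-(9.1.6) and (9.1.10)-(9.1.13) we find that the components
$\Gamma^i_{00}$, $\Gamma^i_{jk}$, and $\Gamma^0_{0i}$ have the expansions

$$ \Gamma^\mu_{\nu\lambda} = \overset{2}{\Gamma}{}^\mu_{\nu\lambda} + \overset{4}{\Gamma}{}^\mu_{\nu\lambda} + \cdots \qquad (\text{for } \Gamma^i_{00},\ \Gamma^i_{jk},\ \Gamma^0_{0i}) \tag{9.1.14} $$

while the components $\Gamma^i_{0j}$, $\Gamma^0_{00}$, and $\Gamma^0_{ij}$ have the expansions

$$ \Gamma^\mu_{\nu\lambda} = \overset{3}{\Gamma}{}^\mu_{\nu\lambda} + \overset{5}{\Gamma}{}^\mu_{\nu\lambda} + \cdots \qquad (\text{for } \Gamma^i_{0j},\ \Gamma^0_{00},\ \Gamma^0_{ij}) \tag{9.1.15} $$

the symbol $\overset{N}{\Gamma}{}^\mu_{\nu\lambda}$ denoting the term in $\Gamma^\mu_{\nu\lambda}$ of order $\bar v^N/\bar r$. The components called for
by (9.1.3) are explicitly

$$ \overset{2}{\Gamma}{}^i_{00} = -\frac{1}{2}\frac{\partial \overset{2}{g}_{00}}{\partial x^i} \tag{9.1.16} $$

$$ \overset{4}{\Gamma}{}^i_{00} = -\frac{1}{2}\frac{\partial \overset{4}{g}_{00}}{\partial x^i} + \frac{\partial \overset{3}{g}_{i0}}{\partial t} + \frac{1}{2}\overset{2}{g}_{ij}\frac{\partial \overset{2}{g}_{00}}{\partial x^j} \tag{9.1.17} $$

$$ \overset{3}{\Gamma}{}^i_{0j} = \frac{1}{2}\left[\frac{\partial \overset{3}{g}_{i0}}{\partial x^j} + \frac{\partial \overset{2}{g}_{ij}}{\partial t} - \frac{\partial \overset{3}{g}_{j0}}{\partial x^i}\right] \tag{9.1.18} $$

$$ \overset{2}{\Gamma}{}^i_{jk} = \frac{1}{2}\left[\frac{\partial \overset{2}{g}_{ij}}{\partial x^k} + \frac{\partial \overset{2}{g}_{ik}}{\partial x^j} - \frac{\partial \overset{2}{g}_{jk}}{\partial x^i}\right] \tag{9.1.19} $$

$$ \overset{3}{\Gamma}{}^0_{00} = -\frac{1}{2}\frac{\partial \overset{2}{g}_{00}}{\partial t} \tag{9.1.20} $$

$$ \overset{2}{\Gamma}{}^0_{0i} = -\frac{1}{2}\frac{\partial \overset{2}{g}_{00}}{\partial x^i} \tag{9.1.21} $$

$$ \overset{1}{\Gamma}{}^0_{ij} = 0 \tag{9.1.22} $$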


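Evidently, we shall have to know the components $g_{ij}$ to order $\bar v^2$, $g_{i0}$ to order
$\bar v^3$, and $g_{00}$ to order $\bar v^4$. This should be contrasted with the Newtonian approximation,
in which we needed $g_{00}$ to order $\bar v^2$ and $g_{i0}$ and $g_{ij}$ only to zeroth order.

To calculate the Ricci tensor we shall use Eq. (6.1.5):

$$ R_{\mu\kappa} = \frac{\partial \Gamma^\lambda_{\mu\lambda}}{\partial x^\kappa} - \frac{\partial \Gamma^\lambda_{\mu\kappa}}{\partial x^\lambda} + \Gamma^\eta_{\mu\lambda}\Gamma^\lambda_{\kappa\eta} - \Gamma^\eta_{\mu\kappa}\Gamma^\lambda_{\lambda\eta} $$

From (9.1.14) and the expansions (9.1.15) and (9.1.16) we find that the components
of $R_{\mu\kappa}$ have the expansions:

$$ R_{00} = \overset{2}{R}_{00} + \overset{4}{R}_{00} + \cdots \tag{9.1.23} $$

$$ R_{i0} = \overset{3}{R}_{i0} + \overset{5}{R}_{i0} + \cdots \tag{9.1.24} $$

$$ R_{ij} = \overset{2}{R}_{ij} + \overset{4}{R}_{ij} + \cdots \tag{9.1.25} $$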
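where $\overset{N}{R}_{\mu\kappa}$ denotes the term in $R_{\mu\kappa}$ of order $\bar v^N/\bar r^2$. The terms we can
calculate from the "known" terms in the affine connection are

$$ \overset{2}{R}_{00} = -\frac{\partial \overset{2}{\Gamma}{}^i_{00}}{\partial x^i} \tag{9.1.26} $$

$$ \overset{4}{R}_{00} = \frac{\partial \overset{3}{\Gamma}{}^i_{0i}}{\partial t} - \frac{\partial \overset{4}{\Gamma}{}^i_{00}}{\partial x^i} + \overset{2}{\Gamma}{}^0_{0i}\,\overset{2}{\Gamma}{}^i_{00} - \overset{2}{\Gamma}{}^i_{00}\,\overset{2}{\Gamma}{}^j_{ij} \tag{9.1.27} $$

$$ \overset{3}{R}_{i0} = \frac{\partial \overset{2}{\Gamma}{}^j_{ij}}{\partial t} - \frac{\partial \overset{3}{\Gamma}{}^j_{0i}}{\partial x^j} \tag{9.1.28} $$

$$ \overset{2}{R}_{ij} = \frac{\partial \overset{2}{\Gamma}{}^0_{i0}}{\partial x^j} + \frac{\partial \overset{2}{\Gamma}{}^k_{ik}}{\partial x^j} - \frac{\partial \overset{2}{\Gamma}{}^k_{ij}}{\partial x^k} \tag{9.1.29} $$

Using (9.1.16)-(9.1.21), these give

$$ \overset{2}{R}_{00} = \frac{1}{2}\nabla^2 \overset{2}{g}_{00} \tag{9.1.30} $$

$$ \overset{4}{R}_{00} = \frac{1}{2}\frac{\partial^2 \overset{2}{g}_{ii}}{\partial t^2} - \frac{\partial^2 \overset{3}{g}_{i0}}{\partial x^i\,\partial t} + \frac{1}{2}\nabla^2 \overset{4}{g}_{00} - \frac{1}{2}\overset{2}{g}_{ij}\frac{\partial^2 \overset{2}{g}_{00}}{\partial x^i\,\partial x^j} - \frac{1}{2}\left(\frac{\partial \overset{2}{g}_{ij}}{\partial x^j}\right)\left(\frac{\partial \overset{2}{g}_{00}}{\partial x^i}\right) $$

$$ \phantom{\overset{4}{R}_{00} =} + \frac{1}{4}\left(\frac{\partial \overset{2}{g}_{00}}{\partial x^i}\right)\left(\frac{\partial \overset{2}{g}_{00}}{\partial x^i}\right) + \frac{1}{4}\left(\frac{\partial \overset{2}{g}_{00}}{\partial x^i}\right)\left(\frac{\partial \overset{2}{g}_{jj}}{\partial x^i}\right) \tag{9.1.31} $$

$$ \overset{3}{R}_{i0} = \frac{1}{2}\frac{\partial^2 \overset{2}{g}_{jj}}{\partial x^i\,\partial t} - \frac{1}{2}\frac{\partial^2 \overset{3}{g}_{j0}}{\partial x^i\,\partial x^j} - \frac{1}{2}\frac{\partial^2 \overset{2}{g}_{ij}}{\partial x^j\,\partial t} + \frac{1}{2}\nabla^2 \overset{3}{g}_{i0} \tag{9.1.32} $$

$$ \overset{2}{R}_{ij} = -\frac{1}{2}\frac{\partial^2 \overset{2}{g}_{00}}{\partial x^i\,\partial x^j} + \frac{1}{2}\frac{\partial^2 \overset{2}{g}_{kk}}{\partial x^i\,\partial x^j} - \frac{1}{2}\frac{\partial^2 \overset{2}{g}_{ik}}{\partial x^k\,\partial x^j} - \frac{1}{2}\frac{\partial^2 \overset{2}{g}_{kj}}{\partial x^k\,\partial x^i} + \frac{1}{2}\nabla^2 \overset{2}{g}_{ij} \tag{9.1.33} $$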


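A tremendous simplification can be achieved at this point by choosing a
suitable coordinate system. We showed in Section 7.4 that it is always possible to
define the $x^\mu$ so that they obey the harmonic coordinate conditions

$$ g^{\mu\nu}\Gamma^\lambda_{\mu\nu} = 0 \tag{9.1.34} $$

Using (9.1.10)-(9.1.13) and (9.1.16)-(9.1.21), we find that the vanishing of the
third-order term in $g^{\mu\nu}\Gamma^0_{\mu\nu}$ gives

$$ 0 = \frac{1}{2}\frac{\partial \overset{2}{g}_{00}}{\partial t} - \frac{\partial \overset{3}{g}_{0i}}{\partial x^i} + \frac{1}{2}\frac{\partial \overset{2}{g}_{ii}}{\partial t} \tag{9.1.35} $$

while the vanishing of the second-order term in $g^{\mu\nu}\Gamma^i_{\mu\nu}$ gives

$$ 0 = \frac{1}{2}\frac{\partial \overset{2}{g}_{00}}{\partial x^i} + \frac{\partial \overset{2}{g}_{ij}}{\partial x^j} - \frac{1}{2}\frac{\partial \overset{2}{g}_{jj}}{\partial x^i} \tag{9.1.36} $$

It follows that

$$ \frac{1}{2}\frac{\partial^2 \overset{2}{g}_{ii}}{\partial t^2} - \frac{\partial^2 \overset{3}{g}_{i0}}{\partial x^i\,\partial t} + \frac{1}{2}\frac{\partial^2 \overset{2}{g}_{00}}{\partial t^2} = 0 $$

$$ \frac{\partial^2 \overset{2}{g}_{ij}}{\partial x^j\,\partial x^k} + \frac{\partial^2 \overset{2}{g}_{kj}}{\partial x^j\,\partial x^i} - \frac{\partial^2 \overset{2}{g}_{jj}}{\partial x^i\,\partial x^k} + \frac{\partial^2 \overset{2}{g}_{00}}{\partial x^i\,\partial x^k} = 0 $$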

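(9.1.30)-(9.1.33) now give simplified formulas for the Ricci tensor:

$$ \overset{2}{R}_{00} = \frac{1}{2}\nabla^2 \overset{2}{g}_{00} \tag{9.1.37} $$

$$ \overset{4}{R}_{00} = \frac{1}{2}\nabla^2 \overset{4}{g}_{00} - \frac{1}{2}\frac{\partial^2 \overset{2}{g}_{00}}{\partial t^2} - \frac{1}{2}\overset{2}{g}_{ij}\frac{\partial^2 \overset{2}{g}_{00}}{\partial x^i\,\partial x^j} + \frac{1}{2}\left(\nabla \overset{2}{g}_{00}\right)^2 \tag{9.1.38} $$

$$ \overset{3}{R}_{i0} = \frac{1}{2}\nabla^2 \overset{3}{g}_{i0} \tag{9.1.39} $$

$$ \overset{2}{R}_{ij} = \frac{1}{2}\nabla^2 \overset{2}{g}_{ij} \tag{9.1.40} $$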


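We are now ready to make use of the Einstein field equations, which we take
in the form

$$ R_{\mu\nu} = -8\pi G\left(T_{\mu\nu} - \tfrac{1}{2}g_{\mu\nu}T^\lambda{}_\lambda\right) \tag{9.1.41} $$

From their interpretation as the energy density, momentum density, and momentum
flux, we expect that $T^{00}$, $T^{i0}$, and $T^{ij}$ will have the expansions

$$ T^{00} = \overset{0}{T}{}^{00} + \overset{2}{T}{}^{00} + \cdots \tag{9.1.42} $$

$$ T^{i0} = \overset{1}{T}{}^{i0} + \overset{3}{T}{}^{i0} + \cdots \tag{9.1.43} $$

$$ T^{ij} = \overset{2}{T}{}^{ij} + \overset{4}{T}{}^{ij} + \cdots \tag{9.1.44} $$

where $\overset{N}{T}{}^{\mu\nu}$ denotes the term in $T^{\mu\nu}$ of order $(\bar M/\bar r^3)\bar v^N$. (In particular $\overset{0}{T}{}^{00}$ is the
density of rest-mass, while $\overset{2}{T}{}^{00}$ is the nonrelativistic part of the energy density.)
What we need is

$$ S_{\mu\nu} \equiv T_{\mu\nu} - \tfrac{1}{2}g_{\mu\nu}T^\lambda{}_\lambda \tag{9.1.45} $$

But $G\bar M/\bar r$ is of order $\bar v^2$, so (9.1.4)-(9.1.6) and (9.1.42)-(9.1.44) give

$$ S_{00} = \overset{0}{S}_{00} + \overset{2}{S}_{00} + \cdots \tag{9.1.46} $$

$$ S_{i0} = \overset{1}{S}_{i0} + \overset{3}{S}_{i0} + \cdots \tag{9.1.47} $$

$$ S_{ij} = \overset{0}{S}_{ij} + \overset{2}{S}_{ij} + \cdots \tag{9.1.48} $$

where $\overset{N}{S}_{\mu\nu}$ denotes the term in $S_{\mu\nu}$ of order $\bar M\bar v^N/\bar r^3$. In particular

$$ \overset{0}{S}_{00} = \tfrac{1}{2}\overset{0}{T}{}^{00} \tag{9.1.49} $$

$$ \overset{2}{S}_{00} = \tfrac{1}{2}\left[\overset{2}{T}{}^{00} - 2\overset{2}{g}_{00}\overset{0}{T}{}^{00} + \overset{2}{T}{}^{ii}\right] \tag{9.1.50} $$

$$ \overset{1}{S}_{i0} = -\overset{1}{T}{}^{0i} \tag{9.1.51} $$

$$ \overset{0}{S}_{ij} = \tfrac{1}{2}\delta_{ij}\overset{0}{T}{}^{00} \tag{9.1.52} $$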

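Using (9.1.37)-(9.1.40) and (9.1.46)-(9.1.52) in the field equations (9.1.41), we
find that the field equations in harmonic coordinates are indeed consistent with
the expansions we have been using, and give

$$ \nabla^2 \overset{2}{g}_{00} = -8\pi G\,\overset{0}{T}{}^{00} \tag{9.1.53} $$

$$ \nabla^2 \overset{4}{g}_{00} = \frac{\partial^2 \overset{2}{g}_{00}}{\partial t^2} + \overset{2}{g}_{ij}\frac{\partial^2 \overset{2}{g}_{00}}{\partial x^i\,\partial x^j} - \left(\frac{\partial \overset{2}{g}_{00}}{\partial x^i}\right)\left(\frac{\partial \overset{2}{g}_{00}}{\partial x^i}\right) - 8\pi G\left[\overset{2}{T}{}^{00} - 2\overset{2}{g}_{00}\overset{0}{T}{}^{00} + \overset{2}{T}{}^{ii}\right] \tag{9.1.54} $$

$$ \nabla^2 \overset{3}{g}_{i0} = +16\pi G\,\overset{1}{T}{}^{i0} \tag{9.1.55} $$

$$ \nabla^2 \overset{2}{g}_{ij} = -8\pi G\,\delta_{ij}\overset{0}{T}{}^{00} \tag{9.1.56} $$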

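From (9.1.53) we find as expected

$$ \overset{2}{g}_{00} = -2\phi \tag{9.1.57} $$

where $\phi$ is the Newtonian potential, defined by Poisson's equation

$$ \nabla^2\phi = 4\pi G\,\overset{0}{T}{}^{00} \tag{9.1.58} $$

Also $\overset{2}{g}_{00}$ must vanish at infinity, so the solution is

$$ \phi(\mathbf{x}, t) = -G\int d^3x'\,\frac{\overset{0}{T}{}^{00}(\mathbf{x}', t)}{|\mathbf{x} - \mathbf{x}'|} \tag{9.1.59} $$

From (9.1.56) we find that the solution for $\overset{2}{g}_{ij}$ that vanishes at infinity is

$$ \overset{2}{g}_{ij} = -2\delta_{ij}\phi \tag{9.1.60} $$

On the other hand, $\overset{3}{g}_{i0}$ is a new vector potential $\zeta_i$:

$$ \overset{3}{g}_{i0} = \zeta_i \tag{9.1.61} $$

and the solution of (9.1.55) that vanishes at infinity is

$$ \zeta_i(\mathbf{x}, t) = -4G\int d^3x'\,\frac{\overset{1}{T}{}^{i0}(\mathbf{x}', t)}{|\mathbf{x} - \mathbf{x}'|} \tag{9.1.62} $$

Finally, we may simplify (9.1.54) by using (9.1.57), (9.1.58), and the identity

$$ \frac{\partial\phi}{\partial x^i}\frac{\partial\phi}{\partial x^i} = \frac{1}{2}\nabla^2\phi^2 - \phi\nabla^2\phi $$

The result is

$$ \overset{4}{g}_{00} = -2\phi^2 - 2\psi \tag{9.1.63} $$

where $\psi$ is a second potential

$$ \nabla^2\psi = \frac{\partial^2\phi}{\partial t^2} + 4\pi G\left[\overset{2}{T}{}^{00} + \overset{2}{T}{}^{ii}\right] \tag{9.1.64} $$

Again, $\overset{4}{g}_{00}$ must vanish at infinity, so the solution is

$$ \psi(\mathbf{x}, t) = -\int \frac{d^3x'}{|\mathbf{x} - \mathbf{x}'|}\left[\frac{1}{4\pi}\frac{\partial^2\phi(\mathbf{x}', t)}{\partial t^2} + G\,\overset{2}{T}{}^{00}(\mathbf{x}', t) + G\,\overset{2}{T}{}^{ii}(\mathbf{x}', t)\right] \tag{9.1.65} $$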

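The coordinate condition (9.1.35) imposes on $\phi$ and $\boldsymbol{\zeta}$ the further relation

$$ 4\frac{\partial\phi}{\partial t} + \nabla\cdot\boldsymbol{\zeta} = 0 \tag{9.1.66} $$

while the other coordinate condition (9.1.36) is now automatically satisfied. We
shall see in Section 3 that (9.1.66) is also satisfied by our solutions, by virtue of the
conservation conditions obeyed by $T^{\mu\nu}$.

Using (9.1.57), (9.1.60), (9.1.61), and (9.1.63) in (9.1.16)-(9.1.22) gives the
desired components of the affine connection

$$ \overset{2}{\Gamma}{}^i_{00} = \frac{\partial\phi}{\partial x^i} \tag{9.1.67} $$

$$ \overset{4}{\Gamma}{}^i_{00} = \frac{\partial}{\partial x^i}\left(2\phi^2 + \psi\right) + \frac{\partial\zeta_i}{\partial t} \tag{9.1.68} $$

$$ \overset{3}{\Gamma}{}^i_{0j} = \frac{1}{2}\left(\frac{\partial\zeta_i}{\partial x^j} - \frac{\partial\zeta_j}{\partial x^i}\right) - \delta_{ij}\frac{\partial\phi}{\partial t} \tag{9.1.69} $$

$$ \overset{2}{\Gamma}{}^i_{jk} = -\delta_{ij}\frac{\partial\phi}{\partial x^k} - \delta_{ik}\frac{\partial\phi}{\partial x^j} + \delta_{jk}\frac{\partial\phi}{\partial x^i} \tag{9.1.70} $$

$$ \overset{3}{\Gamma}{}^0_{00} = \frac{\partial\phi}{\partial t} \tag{9.1.71} $$

$$ \overset{2}{\Gamma}{}^0_{0i} = \frac{\partial\phi}{\partial x^i} \tag{9.1.72} $$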


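As a bonus, we can now also calculate three additional terms in the affine connection
that will play a role in post-Newtonian hydrodynamics:

$$ \overset{3}{\Gamma}{}^0_{ij} = -\frac{1}{2}\left(\frac{\partial\zeta_i}{\partial x^j} + \frac{\partial\zeta_j}{\partial x^i}\right) - \delta_{ij}\frac{\partial\phi}{\partial t} \tag{9.1.73} $$

$$ \overset{4}{\Gamma}{}^0_{0i} = \frac{\partial\psi}{\partial x^i} \tag{9.1.74} $$

$$ \overset{5}{\Gamma}{}^0_{00} = \frac{\partial\psi}{\partial t} + \boldsymbol{\zeta}\cdot\nabla\phi \tag{9.1.75} $$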


2 Particle and Photon Dynamics 

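Before continuing our calculation of the post-Newtonian metric, we shall
take a quick look back at the problem with which we started, that of computing
the acceleration of a freely falling particle to order $\bar v^4/\bar r$. (Detailed applications
of the post-Newtonian method are given in Sections 9.5-9.9.) Inserting the terms
(9.1.67)-(9.1.72) of the affine connection into (9.1.2) immediately gives the equation
of motion:

$$ \frac{d\mathbf{v}}{dt} = -\nabla\left(\phi + 2\phi^2 + \psi\right) - \frac{\partial\boldsymbol{\zeta}}{\partial t} + \mathbf{v}\times(\nabla\times\boldsymbol{\zeta}) + 3\mathbf{v}\frac{\partial\phi}{\partial t} + 4\mathbf{v}(\mathbf{v}\cdot\nabla)\phi - \mathbf{v}^2\nabla\phi \tag{9.2.1} $$

where $v^i \equiv dx^i/dt$.

In addition, we need to know how to convert the harmonic coordinate time $t$
into the proper time $\tau$ measured on a freely falling body of velocity $\mathbf{v}$. By definition

$$ \left(\frac{d\tau}{dt}\right)^2 = -g_{00} - 2g_{i0}v^i - g_{ij}v^iv^j $$

To order $\bar v^4$, this is

$$ \left(\frac{d\tau}{dt}\right)^2 = 1 - \left[\mathbf{v}^2 + \overset{2}{g}_{00}\right] - \left[\overset{4}{g}_{00} + 2\overset{3}{g}_{i0}v^i + \overset{2}{g}_{ij}v^iv^j\right] $$

or using (9.1.57), (9.1.60), (9.1.61), and (9.1.63):

$$ \left(\frac{d\tau}{dt}\right)^2 = 1 + \left[2\phi - \mathbf{v}^2\right] + 2\left[\phi^2 + \psi - \boldsymbol{\zeta}\cdot\mathbf{v} + \phi\mathbf{v}^2\right] $$

The brackets enclose terms of order $\bar v^2$ and $\bar v^4$. By using the power series expansion
of $\sqrt{1 + x}$, we find to order $\bar v^4$:

$$ \frac{d\tau}{dt} = 1 + \phi - \tfrac{1}{2}\mathbf{v}^2 - \tfrac{1}{8}\left(2\phi - \mathbf{v}^2\right)^2 + \phi^2 + \psi - \boldsymbol{\zeta}\cdot\mathbf{v} + \phi\mathbf{v}^2 $$

or

$$ \frac{d\tau}{dt} = 1 - L \tag{9.2.2} $$

where

$$ L = \tfrac{1}{2}\mathbf{v}^2 - \phi - \tfrac{1}{2}\phi^2 - \tfrac{3}{2}\phi\mathbf{v}^2 + \tfrac{1}{8}(\mathbf{v}^2)^2 - \psi + \boldsymbol{\zeta}\cdot\mathbf{v} \tag{9.2.3} $$

Since $\int (d\tau/dt)\,dt$ is stationary, we can regard $L$ as the Lagrangian for a single
particle, and we can derive the equations of motion from the Lagrange equation

$$ \frac{d}{dt}\frac{\partial L}{\partial v^i} = \frac{\partial L}{\partial x^i} \tag{9.2.4} $$

(Acting on $\phi$ or $\boldsymbol{\zeta}$, $d/dt$ is to be taken as $\partial/\partial t + \mathbf{v}\cdot\nabla$.) The reader may readily
check that (9.2.4) agrees with Eq. (9.2.1).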
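The square-root expansion leading to (9.2.2) and (9.2.3) can be spot-checked numerically. The sketch below assumes a one-parameter family of test values, $\phi = -\epsilon^2$, $\mathbf{v}^2 = \epsilon^2$, $\psi = a\epsilon^4$, $\boldsymbol{\zeta}\cdot\mathbf{v} = b\epsilon^4$ (the numbers $a$, $b$ are arbitrary illustrative choices), and compares the exact square root of the order-$\bar v^4$ expression for $(d\tau/dt)^2$ with $1 - L$. If (9.2.3) is right, the difference should scale as $\epsilon^6$.

```python
import math


def dtau_dt_from_square(phi, psi, zeta_dot_v, v2):
    """Square root of 1 + [2 phi - v^2] + 2 [phi^2 + psi - zeta.v + phi v^2]."""
    square = 1.0 + (2.0 * phi - v2) + 2.0 * (phi**2 + psi - zeta_dot_v + phi * v2)
    return math.sqrt(square)


def lagrangian(phi, psi, zeta_dot_v, v2):
    """L of Eq. (9.2.3)."""
    return (0.5 * v2 - phi - 0.5 * phi**2 - 1.5 * phi * v2
            + 0.125 * v2**2 - psi + zeta_dot_v)


def residual(eps, a=0.5, b=0.3):
    """Difference between the square root and 1 - L for the test family."""
    phi, v2 = -eps**2, eps**2
    psi, zeta_dot_v = a * eps**4, b * eps**4
    exact = dtau_dt_from_square(phi, psi, zeta_dot_v, v2)
    return exact - (1.0 - lagrangian(phi, psi, zeta_dot_v, v2))


for eps in (0.04, 0.02, 0.01):
    print(f"eps = {eps}: residual = {residual(eps):.3e}")
```

Halving $\epsilon$ should reduce the residual by a factor close to $2^6 = 64$, confirming that everything through order $\bar v^4$ is accounted for.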

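The post-Newtonian fields can also be used to calculate the acceleration of a
photon in a gravitational field to order $\bar v^2$. (Here $\bar v$ is of course not the photon
speed; it is the typical velocity of the material particles of the system.) Since the
velocity $u^i \equiv dx^i/dt$ of the photon is of order unity, Eq. (9.1.2) now gives its
acceleration as

$$ \frac{du^i}{dt} = -\overset{2}{\Gamma}{}^i_{00} - \overset{2}{\Gamma}{}^i_{jk}u^ju^k + 2u^i\,\overset{2}{\Gamma}{}^0_{0j}u^j + O(\bar v^3) $$

Using (9.1.67), (9.1.70), and (9.1.72), this gives

$$ \frac{d\mathbf{u}}{dt} = -(1 + \mathbf{u}^2)\nabla\phi + 4\mathbf{u}(\mathbf{u}\cdot\nabla\phi) + O(\bar v^3) $$

We also note that the photon speed is given by the condition

$$ 0 = \left(\frac{d\tau}{dt}\right)^2 = 1 - \mathbf{u}^2 + 2(1 + \mathbf{u}^2)\phi + O(\bar v^3) $$

or

$$ |\mathbf{u}| = 1 + 2\phi + O(\bar v^3) \tag{9.2.5} $$

Hence to the required accuracy we can replace $\mathbf{u}^2$ by unity in the photon
acceleration:

$$ \frac{d\mathbf{u}}{dt} = -2\nabla\phi + 4\mathbf{u}(\mathbf{u}\cdot\nabla\phi) + O(\bar v^3) \tag{9.2.6} $$

It is somewhat more convenient to write this as an equation for the unit direction
vector $\hat{\mathbf{u}} \equiv \mathbf{u}/|\mathbf{u}|$. Since $d|\mathbf{u}|/dt = 2\,\hat{\mathbf{u}}\cdot\nabla\phi$ to this order, (9.2.6) gives
$d\hat{\mathbf{u}}/dt = -2[\nabla\phi - \hat{\mathbf{u}}(\hat{\mathbf{u}}\cdot\nabla\phi)]$, that is,

$$ \frac{d\hat{\mathbf{u}}}{dt} = 2\,\hat{\mathbf{u}}\times(\hat{\mathbf{u}}\times\nabla\phi) + O(\bar v^3) \tag{9.2.7} $$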
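As an illustration of what the photon equation implies, the part of (9.2.6) perpendicular to the photon direction is $-2\nabla_\perp\phi$, twice the Newtonian value. The sketch below integrates this transverse acceleration along an undeflected straight path past a point mass with $\phi = -GM/r$, which should reproduce the familiar deflection angle $4GM/b$ for impact parameter $b$. The solar values used ($GM_\odot \approx 1.4766$ km, $R_\odot \approx 6.96\times10^5$ km) are assumed round numbers, and the straight-path (Born) approximation and trapezoidal rule are choices made here, not steps taken from the text.

```python
import math


def deflection_angle(gm, b, half_length_in_b=1000.0, n=200_000):
    """Integrate 2 * GM * b / (x^2 + b^2)^(3/2) dx along y = b, with |u| = 1.

    gm and b are lengths (c = 1); the path runs from -X to +X, X = half_length_in_b * b.
    """
    half_length = half_length_in_b * b
    dx = 2.0 * half_length / n
    total = 0.0
    for k in range(n + 1):
        x = -half_length + k * dx
        weight = 0.5 if k in (0, n) else 1.0  # trapezoidal end weights
        total += weight * 2.0 * gm * b / (x * x + b * b) ** 1.5
    return total * dx


GM_SUN_KM = 1.4766     # assumed value of G * M_sun / c^2
R_SUN_KM = 6.96e5      # assumed solar radius (grazing ray)

angle_rad = deflection_angle(GM_SUN_KM, R_SUN_KM)
angle_arcsec = math.degrees(angle_rad) * 3600.0
print(f"deflection = {angle_rad:.4e} rad = {angle_arcsec:.3f} arcsec")
print(f"4GM/b      = {4.0 * GM_SUN_KM / R_SUN_KM:.4e} rad")
```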


3 The Energy-Momentum Tensor 


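In order to complete the computational program outlined in Section 1, we
must show how to calculate the energy-momentum tensor $T^{\mu\nu}$ that serves as the
source of the gravitational fields. We shall first consider how the conservation
laws of energy and momentum appear in the post-Newtonian approximation. In
general, the conservation laws read $T^{\mu\nu}{}_{;\mu} = 0$, or in more detail

$$ \frac{\partial T^{\mu\nu}}{\partial x^\mu} = -\Gamma^\nu_{\mu\lambda}T^{\mu\lambda} - \Gamma^\mu_{\mu\lambda}T^{\lambda\nu} \tag{9.3.1} $$

The term of order $\bar M\bar v/\bar r^4$ with $\nu = 0$ gives

$$ \frac{\partial}{\partial t}\overset{0}{T}{}^{00} + \frac{\partial}{\partial x^i}\overset{1}{T}{}^{i0} = 0 \tag{9.3.2} $$

since all $\Gamma$'s are at least of order $\bar v^2/\bar r$. This may be regarded as the law of conservation
of mass; it should not surprise us to find mass conserved in the post-Newtonian
approximation, for a large rate of conversion of mass into energy would produce
temperatures at which the particles of the system moved relativistically, in conflict
with the assumption that $\bar v \ll 1$. Apart from its intrinsic importance, Eq. (9.3.2)
is here indispensable to us, because it is needed for consistency of the harmonic
coordinate conditions. From (9.1.53) and (9.1.55), we see that (9.3.2) implies

$$ 0 = \nabla^2\left[-2\frac{\partial \overset{2}{g}_{00}}{\partial t} + \frac{\partial \overset{3}{g}_{0i}}{\partial x^i}\right] = \nabla^2\left[4\frac{\partial\phi}{\partial t} + \nabla\cdot\boldsymbol{\zeta}\right] $$

Since $\phi$ and $\boldsymbol{\zeta}$ must vanish at infinity, we conclude that

$$ 4\frac{\partial\phi}{\partial t} + \nabla\cdot\boldsymbol{\zeta} = 0 $$

thus verifying the coordinate condition (9.1.66).

Returning to Eq. (9.3.1), we find that the $\nu = i$ term of order $\bar M\bar v^2/\bar r^4$ gives

$$ \frac{\partial}{\partial t}\overset{1}{T}{}^{0i} + \frac{\partial}{\partial x^j}\overset{2}{T}{}^{ij} = -\overset{2}{\Gamma}{}^i_{00}\,\overset{0}{T}{}^{00} $$

or, using (9.1.67),

$$ \frac{\partial}{\partial t}\overset{1}{T}{}^{0i} + \frac{\partial}{\partial x^j}\overset{2}{T}{}^{ij} = -\frac{\partial\phi}{\partial x^i}\,\overset{0}{T}{}^{00} \tag{9.3.3} $$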


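Since $\overset{2}{T}{}^{ij}$ is the flux of momentum, we recognize in this the conservation of
momentum; note that the right-hand side is just the Newtonian gravitational
force density, equal to the mass density $\overset{0}{T}{}^{00}$ times $-\nabla\phi$.

There are no other conservation laws which involve only the terms in $T^{\mu\nu}$
needed to calculate the fields in the post-Newtonian approximation, that is,
only $\overset{0}{T}{}^{00}$, $\overset{2}{T}{}^{00}$, $\overset{1}{T}{}^{i0}$, and $\overset{2}{T}{}^{ij}$. Further, we note that the two conservation laws
(9.3.2) and (9.3.3) involve $g_{\mu\nu}$ only through $\phi$, which can be calculated in the
Newtonian approximation. Hence, the procedure to be followed is essentially
iterative. We must first solve the Newtonian equations of motion, use the solution
(plus the equations of state) to determine the terms $\overset{0}{T}{}^{00}$, $\overset{2}{T}{}^{00}$, $\overset{1}{T}{}^{i0}$, and $\overset{2}{T}{}^{ij}$,
compute the post-Newtonian fields $\psi$ and $\boldsymbol{\zeta}$, recompute the motions of the particles,
and so on. It can be shown¹ that this procedure can be kept going; that is, to
compute the fields in the $N$th approximation we need to know terms in $T^{\mu\nu}$ that
satisfy conservation laws that involve the fields only in the $(N - 1)$th approximation.
We shall content ourselves here with writing the conservation laws that
govern terms in $T^{\mu\nu}$ of the next highest order than those appearing in (9.3.2) and
(9.3.3). The $\nu = 0$ term of (9.3.1) of order $\bar M\bar v^3/\bar r^4$ and the $\nu = i$ term of (9.3.1)
of order $\bar M\bar v^4/\bar r^4$ give

$$ \frac{\partial}{\partial t}\overset{2}{T}{}^{00} + \frac{\partial}{\partial x^i}\overset{3}{T}{}^{i0} = -\left(2\overset{3}{\Gamma}{}^0_{00} + \overset{3}{\Gamma}{}^i_{0i}\right)\overset{0}{T}{}^{00} - \left(3\overset{2}{\Gamma}{}^0_{0i} + \overset{2}{\Gamma}{}^j_{ij}\right)\overset{1}{T}{}^{0i} $$

$$ \frac{\partial}{\partial t}\overset{3}{T}{}^{i0} + \frac{\partial}{\partial x^j}\overset{4}{T}{}^{ij} = -\overset{4}{\Gamma}{}^i_{00}\,\overset{0}{T}{}^{00} - \overset{2}{\Gamma}{}^i_{00}\,\overset{2}{T}{}^{00} - \left(2\overset{3}{\Gamma}{}^i_{0j} + \delta_{ij}\overset{3}{\Gamma}{}^0_{00} + \delta_{ij}\overset{3}{\Gamma}{}^k_{0k}\right)\overset{1}{T}{}^{0j} - \left(\overset{2}{\Gamma}{}^i_{jk} + \overset{2}{\Gamma}{}^0_{0j}\delta_{ik} + \overset{2}{\Gamma}{}^l_{jl}\delta_{ik}\right)\overset{2}{T}{}^{jk} $$

or using (9.1.67)-(9.1.72)

$$ \frac{\partial}{\partial t}\overset{2}{T}{}^{00} + \frac{\partial}{\partial x^i}\overset{3}{T}{}^{i0} = \overset{0}{T}{}^{00}\frac{\partial\phi}{\partial t} \tag{9.3.4} $$

$$ \frac{\partial}{\partial t}\overset{3}{T}{}^{0i} + \frac{\partial}{\partial x^j}\overset{4}{T}{}^{ij} = -\overset{0}{T}{}^{00}\left[\frac{\partial}{\partial x^i}\left(2\phi^2 + \psi\right) + \frac{\partial\zeta_i}{\partial t}\right] - \overset{2}{T}{}^{00}\frac{\partial\phi}{\partial x^i} + \overset{1}{T}{}^{0j}\left[\frac{\partial\zeta_j}{\partial x^i} - \frac{\partial\zeta_i}{\partial x^j} + 4\delta_{ij}\frac{\partial\phi}{\partial t}\right] + \overset{2}{T}{}^{jk}\left[4\delta_{ik}\frac{\partial\phi}{\partial x^j} - \delta_{jk}\frac{\partial\phi}{\partial x^i}\right] \tag{9.3.5} $$

As promised, the source terms $\overset{2}{T}{}^{00}$, $\overset{3}{T}{}^{i0}$, and $\overset{4}{T}{}^{ij}$, which would be needed to calculate
the post-post-Newtonian fields, obey conservation laws that involve the metric
only to the post-Newtonian order.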

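We still need a model with which to calculate the energy-momentum tensor.
The simplest such model is that of an assembly of freely falling particles that
interact only gravitationally and, perhaps, in localized collisions. From Eq.
(5.3.5) we have

$$ T^{\mu\nu}(\mathbf{x}, t) = g^{-1/2}(\mathbf{x}, t)\sum_n m_n\frac{dx_n^\mu}{dt}\frac{dx_n^\nu}{dt}\left(\frac{d\tau_n}{dt}\right)^{-1}\delta^3(\mathbf{x} - \mathbf{x}_n(t)) \tag{9.3.6} $$

where $m_n$, $x_n^\mu(t)$, and $\tau_n$ are the mass, space-time position, and proper time of the
$n$th particle, and $-g$ is the determinant of $g_{\mu\nu}$. An elementary calculation using
Eq. (4.7.5) gives

$$ g = 1 + \overset{2}{g} + \overset{4}{g} + \cdots $$

where $\overset{N}{g}$ is of order $\bar v^N$, and in particular

$$ \overset{2}{g} = \eta^{\mu\nu}\overset{2}{g}_{\mu\nu} = -\overset{2}{g}_{00} + \overset{2}{g}_{ii} = -4\phi \tag{9.3.7} $$

Using (9.3.7) and (9.2.3) in (9.3.6), we find

$$ \overset{0}{T}{}^{00} = \sum_n m_n\,\delta^3(\mathbf{x} - \mathbf{x}_n) \tag{9.3.8} $$

$$ \overset{2}{T}{}^{00} = \sum_n m_n\left(\phi + \tfrac{1}{2}\mathbf{v}_n^2\right)\delta^3(\mathbf{x} - \mathbf{x}_n) \tag{9.3.9} $$

$$ \overset{1}{T}{}^{i0} = \sum_n m_n v_n^i\,\delta^3(\mathbf{x} - \mathbf{x}_n) \tag{9.3.10} $$

$$ \overset{2}{T}{}^{ij} = \sum_n m_n v_n^i v_n^j\,\delta^3(\mathbf{x} - \mathbf{x}_n) \tag{9.3.11} $$

where $\mathbf{v}_n \equiv d\mathbf{x}_n/dt$. To impose the conservation laws, we must recall that

$$ \frac{\partial}{\partial t}\delta^3(\mathbf{x} - \mathbf{x}_n(t)) = v_n^i\frac{\partial}{\partial x_n^i}\delta^3(\mathbf{x} - \mathbf{x}_n(t)) = -\mathbf{v}_n\cdot\nabla\delta^3(\mathbf{x} - \mathbf{x}_n(t)) $$

so

$$ \frac{\partial}{\partial t}\overset{0}{T}{}^{00} + \frac{\partial}{\partial x^i}\overset{1}{T}{}^{i0} = 0 $$

$$ \frac{\partial}{\partial t}\overset{1}{T}{}^{0i} + \frac{\partial}{\partial x^j}\overset{2}{T}{}^{ij} = \sum_n m_n\frac{dv_n^i}{dt}\,\delta^3(\mathbf{x} - \mathbf{x}_n) $$

We see that the mass-conservation equation (9.3.2) is automatically satisfied,
while the equation (9.3.3) of momentum conservation is satisfied if and only if
each particle obeys the Newtonian equation of motion:

$$ \frac{d\mathbf{v}_n}{dt} = -\nabla\phi(\mathbf{x}_n) \tag{9.3.12} $$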

The program for calculating the motion of a set of gravitating point particles is
therefore

(A) Solve the Newtonian problem; that is, solve Eqs. (9.3.12) and (9.1.58)
for $\phi(\mathbf{x})$ and $\mathbf{x}_n(t)$. (This is the only step that is not always straightforward.)

(B) Use the results of (A) and Eqs. (9.3.8)-(9.3.11) to compute the terms
$\overset{0}{T}{}^{00}$, $\overset{2}{T}{}^{00}$, $\overset{1}{T}{}^{i0}$, $\overset{2}{T}{}^{ij}$ of the energy-momentum tensor.

(C) Use the results of (A) and (B), and Eqs. (9.1.62) and (9.1.65), to compute
the post-Newtonian fields $\boldsymbol{\zeta}$ and $\psi$.

(D) Use the results of (A) and (C) and Eq. (9.2.1) to calculate the
post-Newtonian corrections to the trajectories $\mathbf{x}_n(t)$.

(E) And so on.


4 Multipole Fields 

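As a first example, let us calculate the gravitational field far away from an
arbitrary finite distribution of energy and momentum. Let $T^{\mu\nu}(\mathbf{x}, t)$ vanish for
$r > R$, where $r \equiv |\mathbf{x}|$. We may then expand the denominators $|\mathbf{x} - \mathbf{x}'|$ of Eqs.
(9.1.59), (9.1.62), and (9.1.65) in inverse powers of $r/R$:

$$ |\mathbf{x} - \mathbf{x}'|^{-1} = \frac{1}{r} + \frac{\mathbf{x}\cdot\mathbf{x}'}{r^3} + \cdots \tag{9.4.1} $$

and we find

$$ \phi = -\frac{G\overset{0}{M}}{r} - \frac{G\,\mathbf{x}\cdot\overset{0}{\mathbf{D}}}{r^3} + O\!\left(\frac{1}{r^3}\right) \tag{9.4.2} $$

$$ \zeta_i = -\frac{4G\overset{1}{P}{}^i}{r} - \frac{2G\,x^j\overset{1}{J}_{ji}}{r^3} + O\!\left(\frac{1}{r^3}\right) \tag{9.4.3} $$

$$ \psi = -\frac{G\overset{2}{M}}{r} - \frac{G\,\mathbf{x}\cdot\overset{2}{\mathbf{D}}}{r^3} + O\!\left(\frac{1}{r^3}\right) \tag{9.4.4} $$

where

$$ \overset{0}{M} \equiv \int \overset{0}{T}{}^{00}\,d^3x \tag{9.4.5} $$

$$ \overset{0}{\mathbf{D}} \equiv \int \mathbf{x}\,\overset{0}{T}{}^{00}\,d^3x \tag{9.4.6} $$

$$ \overset{1}{P}{}^i \equiv \int \overset{1}{T}{}^{i0}\,d^3x \tag{9.4.7} $$

$$ \overset{1}{J}_{ij} \equiv 2\int x^i\,\overset{1}{T}{}^{j0}\,d^3x \tag{9.4.8} $$

$$ \overset{2}{M} \equiv \int \left(\overset{2}{T}{}^{00} + \overset{2}{T}{}^{ii}\right)d^3x \tag{9.4.9} $$

$$ \overset{2}{\mathbf{D}} \equiv \int \mathbf{x}\left(\overset{2}{T}{}^{00} + \overset{2}{T}{}^{ii} + \frac{1}{4\pi G}\frac{\partial^2\phi}{\partial t^2}\right)d^3x \tag{9.4.10} $$

(The term $\partial^2\phi/\partial t^2$ does not contribute to $\overset{2}{M}$, because it equals $-\tfrac{1}{4}\nabla\cdot(\partial\boldsymbol{\zeta}/\partial t)$,
and hence vanishes upon integration.)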
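The truncation (9.4.1) can be checked numerically for a single source point. In the sketch below the source offset and the field-point direction are arbitrary values chosen for illustration; the error of the two-term expansion should fall off roughly as $1/r^3$, which is the size of the neglected quadrupole term.

```python
import math


def inverse_distance(x, x_prime):
    """Exact 1 / |x - x'|."""
    return 1.0 / math.dist(x, x_prime)


def monopole_plus_dipole(x, x_prime):
    """First two terms of Eq. (9.4.1): 1/r + x.x'/r^3."""
    r = math.sqrt(sum(c * c for c in x))
    x_dot_xp = sum(a * b for a, b in zip(x, x_prime))
    return 1.0 / r + x_dot_xp / r**3


def truncation_error(r, x_prime=(0.3, -0.2, 0.5)):
    """Error of the two-term expansion for a field point on the x axis."""
    x = (r, 0.0, 0.0)
    return inverse_distance(x, x_prime) - monopole_plus_dipole(x, x_prime)


for r in (5.0, 10.0, 20.0, 40.0):
    print(f"r = {r:5.1f}: error = {truncation_error(r):+.3e}")
```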

The field $\psi$ has physical effects only through its presence in the expansion of $g_{00}$:

$$ g_{00} = -1 - 2\phi - 2\psi - 2\phi^2 + O(\bar v^6) $$

Evidently, we can take account of $\psi$ by simply replacing $\phi$ everywhere with $\psi + \phi$.
That is, within the accuracy of the post-Newtonian approximation we may write

$$ g_{00} = -1 - 2(\phi + \psi) - 2(\phi + \psi)^2 + O(\bar v^6) \tag{9.4.11} $$

Equations (9.4.2) and (9.4.4) give the physically significant field $\phi + \psi$ as

$$ \phi + \psi = -\frac{GM}{r} - \frac{G\,\mathbf{x}\cdot\mathbf{D}}{r^3} + O\!\left(\frac{1}{r^3}\right) \tag{9.4.12} $$

where

$$ M \equiv \overset{0}{M} + \overset{2}{M} \qquad \mathbf{D} \equiv \overset{0}{\mathbf{D}} + \overset{2}{\mathbf{D}} \tag{9.4.13} $$

The quantity $\mathbf{D}$ does not represent an effect of physical importance, but rather
just a displacement of the whole field, for (9.4.12) may be written

$$ \phi + \psi = -\frac{GM}{|\mathbf{x} - \mathbf{D}/M|} + O\!\left(\frac{1}{r^3}\right) \tag{9.4.14} $$

We could have avoided the $\mathbf{D}$ term altogether by defining our coordinate system
with its origin at the center of energy. On the other hand, the $1/r$ and $1/r^2$ terms
in the expansion (9.4.3) for $\boldsymbol{\zeta}$ are true physical effects of great interest.

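We can derive a number of useful properties of the moments of $T^{\mu\nu}$ by making
use of energy and momentum conservation. From the mass-conservation equation
(9.3.2) it follows that in general

$$ \frac{d\overset{0}{M}}{dt} = 0 \tag{9.4.15} $$

$$ \frac{d\overset{0}{\mathbf{D}}}{dt} = \overset{1}{\mathbf{P}} \tag{9.4.16} $$

If the energy-momentum tensor is time independent then (9.3.2) reads

$$ \frac{\partial}{\partial x^i}\overset{1}{T}{}^{i0} = 0 $$

and therefore, integrating by parts,

$$ 0 = \int x^i\frac{\partial}{\partial x^j}\overset{1}{T}{}^{j0}\,d^3x = -\overset{1}{P}{}^i \tag{9.4.17} $$

$$ 0 = 2\int x^ix^j\frac{\partial}{\partial x^k}\overset{1}{T}{}^{k0}\,d^3x = -\overset{1}{J}_{ij} - \overset{1}{J}_{ji} \tag{9.4.18} $$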

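The result that $\overset{1}{\mathbf{P}}$ vanishes for a static system is hardly surprising. The result that
$\overset{1}{J}_{ij}$ is antisymmetric is not so obvious, and allows us to write it as

$$ \overset{1}{J}_{ij} = \epsilon_{ijk}\overset{1}{J}_k \tag{9.4.19} $$

where $\overset{1}{J}_k$ is the angular momentum vector

$$ \overset{1}{J}_k \equiv \tfrac{1}{2}\epsilon_{ijk}\overset{1}{J}_{ij} = \int d^3x\,\epsilon_{ijk}\,x^i\,\overset{1}{T}{}^{j0} \tag{9.4.20} $$

Using (9.4.17) and (9.4.19) in (9.4.3) gives

$$ \boldsymbol{\zeta} = \frac{2G}{r^3}\left(\mathbf{x}\times\overset{1}{\mathbf{J}}\right) + O\!\left(\frac{1}{r^3}\right) \tag{9.4.21} $$

Our results (9.4.14) and (9.4.21) for $\phi + \psi$ and $\boldsymbol{\zeta}$ hold generally only far from
the gravitating mass. However, they also hold right down to the surface of a
spherical distribution of energy and momentum. First suppose that $T^{\mu\nu}(\mathbf{x}, t)$
depends on the position $\mathbf{x}$ only through the radius $r \equiv |\mathbf{x}|$. Then the factor
$|\mathbf{x} - \mathbf{x}'|^{-1}$ can be replaced in (9.1.59), (9.1.62), and (9.1.65) with its angular
average, which for $r > r'$ is

$$ \frac{1}{4\pi}\int\frac{d\Omega'}{|\mathbf{x} - \mathbf{x}'|} = \frac{1}{2}\int_0^\pi\frac{\sin\theta\,d\theta}{\left[r^2 - 2rr'\cos\theta + r'^2\right]^{1/2}} = \frac{1}{r} $$

Hence everywhere outside the sphere the fields are

$$ \phi = -\frac{G\overset{0}{M}}{r} \tag{9.4.22} $$

$$ \boldsymbol{\zeta} = -\frac{4G\overset{1}{\mathbf{P}}}{r} \tag{9.4.23} $$

$$ \psi = -\frac{G\overset{2}{M}}{r} \tag{9.4.24} $$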
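The angular average used above is easy to confirm numerically. The sketch below evaluates $\tfrac{1}{2}\int_0^\pi \sin\theta\,d\theta\,[r^2 - 2rr'\cos\theta + r'^2]^{-1/2}$ with a midpoint rule (a choice made here for simplicity); for $r > r'$ it should return $1/r$, and for $r' > r$ the same integral gives $1/r'$, which is the case relevant inside a hollow shell.

```python
import math


def angular_average(r, r_prime, n=20_000):
    """Midpoint-rule estimate of (1/4pi) * integral of dOmega' / |x - x'|."""
    dtheta = math.pi / n
    total = 0.0
    for k in range(n):
        theta = (k + 0.5) * dtheta
        distance = math.sqrt(r * r - 2.0 * r * r_prime * math.cos(theta)
                             + r_prime * r_prime)
        total += math.sin(theta) / distance
    return 0.5 * total * dtheta


print(angular_average(2.0, 1.0), 1.0 / 2.0)   # field point outside: 1/r
print(angular_average(1.0, 3.0), 1.0 / 3.0)   # field point inside:  1/r'
```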


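If the sphere is at rest, then $\overset{1}{\mathbf{P}}$ vanishes; in this case (9.1.57), (9.1.60), (9.1.61),
(9.1.63), and (9.4.13) give the metric as

$$ g_{00} \simeq -1 + \frac{2MG}{r} - \frac{2M^2G^2}{r^2} \tag{9.4.25} $$

$$ g_{i0} \simeq 0 \tag{9.4.26} $$

$$ g_{ij} \simeq \delta_{ij} + 2\delta_{ij}\frac{MG}{r} \tag{9.4.27} $$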

This is in agreement with the exact Schwarzschild solution, given in harmonic
coordinates by Eq. (8.2.15):

$$ g_{00} = -\frac{1 - MG/r}{1 + MG/r} \qquad g_{i0} = 0 $$

$$ g_{ij} = \left(1 + \frac{MG}{r}\right)^2\delta_{ij} + \left(\frac{MG}{r}\right)^2\frac{1 + MG/r}{1 - MG/r}\,\frac{x^ix^j}{r^2} $$

However, there is an important difference in the two derivations, in that the exact
Schwarzschild solution was derived in Section 7.2 for a static spherically symmetric
system, while the post-Newtonian solution is valid for a system that can vary over
times of order $\bar r/\bar v$. It will be shown in Section 11.7 that the Schwarzschild solution
is actually valid outside any spherically symmetric system, static or not.
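The agreement between (9.4.25) and the exact harmonic-coordinate $g_{00}$ can be made quantitative. With $a \equiv MG/r$, the exact expression $-(1 - a)/(1 + a)$ expands as $-1 + 2a - 2a^2 + 2a^3 - \cdots$, so the post-Newtonian form should differ from it by about $2a^3$. The sketch below checks this for a few illustrative values of $a$ (chosen large enough that the difference is well above floating-point rounding).

```python
def g00_post_newtonian(a):
    """Eq. (9.4.25) with a = MG/r."""
    return -1.0 + 2.0 * a - 2.0 * a * a


def g00_harmonic_schwarzschild(a):
    """Exact g_00 in harmonic coordinates, Eq. (8.2.15), with a = MG/r."""
    return -(1.0 - a) / (1.0 + a)


def scaled_difference(a):
    """(exact - post-Newtonian) / a^3; tends to 2 as a -> 0."""
    return (g00_harmonic_schwarzschild(a) - g00_post_newtonian(a)) / a**3


for a in (1e-1, 1e-2, 1e-3):
    print(f"a = {a:g}: (exact - PN) / a^3 = {scaled_difference(a):.4f}")
```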
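Now consider a system that is at rest and spherically symmetric, but rotates
with angular frequency $\boldsymbol{\omega}(r)$. The momentum density is now given by

$$ \overset{1}{T}{}^{i0}(\mathbf{x}', t) = \overset{0}{T}{}^{00}(r')\left[\boldsymbol{\omega}(r')\times\mathbf{x}'\right]_i \tag{9.4.28} $$

Equation (9.1.62) then gives the field $\boldsymbol{\zeta}$ as

$$ \boldsymbol{\zeta}(\mathbf{x}) = -4G\int\frac{\left[\boldsymbol{\omega}(r')\times\mathbf{x}'\right]}{|\mathbf{x} - \mathbf{x}'|}\,\overset{0}{T}{}^{00}(r')\,d^3x' \tag{9.4.29} $$

The solid-angle integral is now

$$ \int\frac{d\Omega'\,\mathbf{x}'}{|\mathbf{x} - \mathbf{x}'|} = \frac{4\pi r'^2}{3r^3}\,\mathbf{x} \qquad \text{for } r' < r \tag{9.4.30} $$

$$ \int\frac{d\Omega'\,\mathbf{x}'}{|\mathbf{x} - \mathbf{x}'|} = \frac{4\pi}{3r'}\,\mathbf{x} \qquad \text{for } r' > r \tag{9.4.31} $$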
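Thus, the field outside the sphere is

$$ \boldsymbol{\zeta}(\mathbf{x}) = \frac{16\pi G}{3r^3}\,\mathbf{x}\times\int\boldsymbol{\omega}(r')\,\overset{0}{T}{}^{00}(r')\,r'^4\,dr' \tag{9.4.32} $$

The integral may be expressed in terms of the angular momentum, given by
Eqs. (9.4.20) and (9.4.28) as

$$ \overset{1}{\mathbf{J}} = \int\left(\mathbf{x}'\times\left[\boldsymbol{\omega}(r')\times\mathbf{x}'\right]\right)\overset{0}{T}{}^{00}(r')\,d^3x'
 = \int\left[r'^2\boldsymbol{\omega}(r') - \mathbf{x}'\left(\mathbf{x}'\cdot\boldsymbol{\omega}(r')\right)\right]\overset{0}{T}{}^{00}(r')\,d^3x'
 = \frac{8\pi}{3}\int\boldsymbol{\omega}(r')\,\overset{0}{T}{}^{00}(r')\,r'^4\,dr' \tag{9.4.33} $$

Thus (9.4.32) gives, everywhere outside the sphere,

$$ \boldsymbol{\zeta}(\mathbf{x}) = \frac{2G}{r^3}\left(\mathbf{x}\times\overset{1}{\mathbf{J}}\right) \tag{9.4.34} $$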
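in agreement with the general asymptotic formula (9.4.21). The field inside a
hollow spinning sphere is given by (9.4.29) and (9.4.31) as

$$ \boldsymbol{\zeta}(\mathbf{x}) = \mathbf{x}\times\boldsymbol{\Omega} \tag{9.4.35} $$

where

$$ \boldsymbol{\Omega} \equiv \frac{16\pi G}{3}\int\boldsymbol{\omega}(r')\,\overset{0}{T}{}^{00}(r')\,r'\,dr' \tag{9.4.36} $$

The implications of this result for Mach's principle are discussed in Section 9.7.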
5 Precession of Perihelia


We shall now see how the post-Newtonian formalism developed in the last
four sections can be used to calculate the precession of planetary orbits in the
actual solar system, taking into account other planets, solar rotation, solar
oblateness, and so on. The potential $\phi + \psi$ that determines $g_{00}$ [see Eq. (9.4.11)]
is overwhelmingly dominated by the spherically symmetric part $-GM_\odot/r$ of
the sun's contribution, so it is convenient to write

$$ \phi + \psi = -\frac{GM_\odot}{r} + \epsilon(\mathbf{x}, t) \tag{9.5.1} $$

with $\epsilon$ including not only the Newtonian potentials of the other planets but also
any quadrupole or higher terms in the sun's contribution to $\phi + \psi$. The equation
of motion (9.2.1) of a point particle now reads

$$ \frac{d\mathbf{v}}{dt} = -\frac{GM_\odot\mathbf{x}}{r^3} + \boldsymbol{\eta} + O(\bar v^6) \tag{9.5.2} $$

with $\boldsymbol{\eta}$ a small perturbation:

$$ \boldsymbol{\eta} = -\nabla\left(\epsilon + 2\phi^2\right) - \frac{\partial\boldsymbol{\zeta}}{\partial t} + \mathbf{v}\times(\nabla\times\boldsymbol{\zeta}) + 3\mathbf{v}\frac{\partial\phi}{\partial t} + 4\mathbf{v}(\mathbf{v}\cdot\nabla)\phi - \mathbf{v}^2\nabla\phi \tag{9.5.3} $$


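By far the most convenient technique for calculating the precession of perihelia
is to compute the rate of change of the Runge-Lenz vector

$$ \mathbf{A} \equiv -M_\odot G\frac{\mathbf{x}}{r} + (\mathbf{v}\times\mathbf{h}) \tag{9.5.4} $$

Here $r \equiv |\mathbf{x}|$, $\mathbf{v} \equiv d\mathbf{x}/dt$, and $\mathbf{h}$ is the orbital angular momentum per unit mass:

$$ \mathbf{h} \equiv \mathbf{x}\times\mathbf{v} \tag{9.5.5} $$

If the perturbation $\boldsymbol{\eta}$ in Eq. (9.5.2) were absent, then the orbit would be an ellipse,
described by the familiar formulas

$$ r = \frac{L}{1 + e\cos(\varphi - \varphi_0)} \tag{9.5.6} $$

$$ \frac{d\varphi}{dt} = \frac{\sqrt{LM_\odot G}}{r^2} \tag{9.5.7} $$

$$ \frac{dr}{dt} = e\sqrt{\frac{M_\odot G}{L}}\,\sin(\varphi - \varphi_0) \tag{9.5.8} $$

with $e$ the eccentricity and $L$ the semilatus rectum. (See Section 8.6. We are taking
the orbit to lie in the plane $\theta = \pi/2$, with perihelia at an azimuthal angle $\varphi_0$.)
Then $\mathbf{h}$ would be a constant vector normal to the orbit, with magnitude

$$ |\mathbf{h}| = \sqrt{LM_\odot G} \tag{9.5.9} $$

and $\mathbf{A}$ would be a constant vector pointing toward perihelion, with magnitude

$$ |\mathbf{A}| = eM_\odot G \tag{9.5.10} $$

Thus, the rate of precession of perihelia $d\varphi_0/dt$ caused by any perturbation is just
the component of the change $d\hat{\mathbf{A}}/dt$ in the unit vector $\hat{\mathbf{A}} \equiv \mathbf{A}/|\mathbf{A}|$ along a direction
perpendicular to both $\mathbf{A}$ and $\mathbf{h}$, that is

$$ \frac{d\varphi_0}{dt} = \frac{(\mathbf{h}\times\hat{\mathbf{A}})}{|\mathbf{h}|}\cdot\frac{d\hat{\mathbf{A}}}{dt} = \frac{(\mathbf{h}\times\mathbf{A})\cdot(d\mathbf{A}/dt)}{|\mathbf{h}|\,|\mathbf{A}|^2} \tag{9.5.11} $$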

(If $d\varphi_0/dt$ is positive, then the precession is in the same sense as the direction of
the planet's motion.) A straightforward calculation gives for the rate of change of
$\mathbf{A}$ produced by the perturbation $\boldsymbol{\eta}$ in (9.5.2) the value

$$ \frac{d\mathbf{A}}{dt} = \boldsymbol{\eta}\times\mathbf{h} + \mathbf{v}\times(\mathbf{x}\times\boldsymbol{\eta}) \tag{9.5.12} $$

Note that $d\mathbf{A}/dt$ and hence $d\varphi_0/dt$ is linear in $\boldsymbol{\eta}$, so $d\varphi_0/dt$ is correctly calculated by
adding up the precessions produced by each small term in $\boldsymbol{\eta}$.

The largest term in $\boldsymbol{\eta}$ is the part of $-\nabla\epsilon$ arising from the Newtonian potentials
of the other planets. We shall make no attempt to calculate this term; the experts
tell us that it produces a precession $d\varphi_0/dt$, which for Mercury is about 532" per
century. (See Section 8.6.) The next largest term is obtained from the relativistic
corrections in Eq. (9.5.3), setting $\phi$ and $\boldsymbol{\zeta}$ equal to the values they would have for
a spherical nonrotating sun:

$$ \phi_\odot = -\frac{GM_\odot}{r} \qquad \boldsymbol{\zeta}_\odot = 0 \tag{9.5.13} $$

Then (9.5.3) gives

$$ \boldsymbol{\eta} = -2\nabla\phi_\odot^2 + 4\mathbf{v}(\mathbf{v}\cdot\nabla)\phi_\odot - \mathbf{v}^2\nabla\phi_\odot \tag{9.5.14} $$

Using (9.5.12)-(9.5.14) and (9.5.6)-(9.5.10) in (9.5.11) gives the precession as

$$ \frac{d\varphi_0}{dt} = 8M_\odot G\,hL^{-3}\left[1 + e\cos(\varphi - \varphi_0)\right]^3\sin^2(\varphi - \varphi_0) - M_\odot G\,e^{-1}hL^{-3} $$

$$ \qquad \times\left\{7e^2\left[1 + e\cos(\varphi - \varphi_0)\right]^2\sin^2(\varphi - \varphi_0) + 4\left[1 + e\cos(\varphi - \varphi_0)\right]^3 - \left[1 + e\cos(\varphi - \varphi_0)\right]^4\right\}\cos(\varphi - \varphi_0) \tag{9.5.15} $$
Since $\varphi_0$ changes slowly, the change in $\varphi_0$ in one revolution can be determined
by integrating $d\varphi_0/dt$ over one period, keeping $\varphi_0$ fixed in the integrand, and using
for $d\varphi/dt$ the Keplerian formulas (9.5.6)-(9.5.10). This gives for the precession per
revolution

$$ \Delta\varphi_0 = \int_0^{2\pi}\frac{d\varphi_0}{dt}\frac{dt}{d\varphi}\,d\varphi = \frac{L^2}{h}\int_0^{2\pi}\frac{d\varphi_0}{dt}\left[1 + e\cos(\varphi - \varphi_0)\right]^{-2}d\varphi \tag{9.5.16} $$

Most terms drop out on performing the angular integration, and we are left with

$$ \Delta\varphi_0 = \frac{6\pi M_\odot G}{L} \tag{9.5.17} $$

in agreement with our earlier result, Eq. (8.6.11).
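The chain (9.5.11), (9.5.12), (9.5.14), (9.5.16) can be verified numerically without using any closed form for the integrand. The sketch below builds $\boldsymbol{\eta}$ from (9.5.14) at each point of a Keplerian ellipse, forms $d\mathbf{A}/dt$ from (9.5.12), projects it as in (9.5.11), and integrates over one revolution with $dt/d\varphi = r^2/h$. The helper functions, the midpoint rule, and the Mercury values $GM_\odot \approx 1.4766$ km, $e \approx 0.2056$ and about 415.2 revolutions per century are assumptions introduced here; $L = 55.5\times10^6$ km is the value quoted later in this section.

```python
import math


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def scale(s, a):
    return (s * a[0], s * a[1], s * a[2])


def add(*vectors):
    return tuple(sum(components) for components in zip(*vectors))


def einstein_precession(mu, L, e, n=2000):
    """Precession per revolution for mu = M G (c = 1), perihelion on the +x axis."""
    h_mag = math.sqrt(L * mu)                 # Eq. (9.5.9)
    h = (0.0, 0.0, h_mag)
    A = (e * mu, 0.0, 0.0)                    # Eq. (9.5.10), toward perihelion
    h_cross_A = cross(h, A)
    dvarphi = 2.0 * math.pi / n
    total = 0.0
    for k in range(n):
        ang = (k + 0.5) * dvarphi
        c, s = math.cos(ang), math.sin(ang)
        r = L / (1.0 + e * c)                 # Eq. (9.5.6)
        x = (r * c, r * s, 0.0)
        r_dot = e * math.sqrt(mu / L) * s     # Eq. (9.5.8)
        v = (r_dot * c - (h_mag / r) * s, r_dot * s + (h_mag / r) * c, 0.0)
        grad_phi = scale(mu / r**3, x)                 # phi = -mu / r
        grad_phi_sq = scale(-2.0 * mu / r, grad_phi)   # grad(phi^2) = 2 phi grad(phi)
        eta = add(scale(-2.0, grad_phi_sq),            # Eq. (9.5.14)
                  scale(4.0 * dot(v, grad_phi), v),
                  scale(-dot(v, v), grad_phi))
        dA_dt = add(cross(eta, h), cross(v, cross(x, eta)))   # Eq. (9.5.12)
        rate = dot(h_cross_A, dA_dt) / (h_mag * dot(A, A))    # Eq. (9.5.11)
        total += rate * (r * r / h_mag) * dvarphi             # Eq. (9.5.16)
    return total


MU_SUN_KM = 1.4766        # assumed G * M_sun / c^2
L_MERCURY_KM = 55.5e6     # semilatus rectum quoted in the text
E_MERCURY = 0.2056        # assumed eccentricity
REVS_PER_CENTURY = 415.2  # assumed number of Mercury revolutions per century

delta = einstein_precession(MU_SUN_KM, L_MERCURY_KM, E_MERCURY)
arcsec_per_century = math.degrees(delta) * 3600.0 * REVS_PER_CENTURY
print(f"numerical   : {delta:.6e} rad/rev")
print(f"6 pi M G / L: {6.0 * math.pi * MU_SUN_KM / L_MERCURY_KM:.6e} rad/rev")
print(f"per century : {arcsec_per_century:.2f} arcsec")
```

Because the integrand is a low-order trigonometric polynomial once the factor $r^2/h$ is included, a modest number of equally spaced sample points already reproduces $6\pi M_\odot G/L$ essentially to rounding error.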

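As an example of another small term in the precession, let us calculate the
effect of the field $\boldsymbol{\zeta}$ produced by the sun's rotation. According to (9.4.34), this field is

$$ \boldsymbol{\zeta} = \frac{2G}{r^3}\left(\mathbf{x}\times\mathbf{J}_\odot\right) \tag{9.5.18} $$

This contributes to the acceleration $d\mathbf{v}/dt$ an amount given by (9.5.3) as

$$ \boldsymbol{\eta} = \mathbf{v}\times(\nabla\times\boldsymbol{\zeta}) = 6G\,\mathbf{h}\,(\mathbf{x}\cdot\mathbf{J}_\odot)\,r^{-5} + 2G\,(\mathbf{v}\times\mathbf{J}_\odot)\,r^{-3} \tag{9.5.19} $$

and (9.5.12) tells us that this causes $\mathbf{A}$ to change at a rate

$$ \frac{d\mathbf{A}}{dt} = -6G\,\mathbf{h}\,(\mathbf{v}\cdot\mathbf{x})(\mathbf{x}\cdot\mathbf{J}_\odot)\,r^{-5} - 2G\,(\mathbf{v}\times\mathbf{J}_\odot)(\mathbf{x}\cdot\mathbf{v})\,r^{-3} - 2G\,\mathbf{v}\,(\mathbf{h}\cdot\mathbf{J}_\odot)\,r^{-3} \tag{9.5.20} $$

For simplicity we will take the sun's axis of rotation to be normal to the plane of
the planet's orbit, so that $\mathbf{J}_\odot$ is parallel to $\mathbf{h}$. Using (9.5.20) and (9.5.6)-(9.5.10)
in (9.5.11) gives for the rate of precession

$$ \frac{d\varphi_0}{dt} = \frac{2J_\odot h^2}{M_\odot L^4 e}\left\{-\left[1 + e\cos(\varphi - \varphi_0)\right]^2 e\sin^2(\varphi - \varphi_0) - \left[1 + e\cos(\varphi - \varphi_0)\right]^3\left[e + \cos(\varphi - \varphi_0)\right]\right\} \tag{9.5.21} $$

and (9.5.16) then gives for the precession per revolution

$$ \Delta\varphi_0 = -\frac{8\pi J_\odot h}{M_\odot L^2} \tag{9.5.22} $$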

The sun is generally supposed to have an angular momentum $J_\odot \simeq 1.7 \times 10^{48}$ g cm$^2$ sec$^{-1}$,
and its mass is $M_\odot = 1.99 \times 10^{33}$ g, so in our natural units
with 1 sec = $3 \times 10^{10}$ cm we have

$$ \frac{J_\odot}{M_\odot} \simeq 0.28 \text{ km} $$

Also, the orbit of Mercury has $L = 55.5 \times 10^6$ km and $h = 9.03 \times 10^3$ km, so
the field $\boldsymbol{\zeta}$ would contribute to the precession of Mercury's perihelia an amount

$$ \Delta\varphi_0 \simeq -2.06 \times 10^{-11} \text{ radians/revolution} $$

or in more conventional terms

$$ \Delta\varphi_0 \simeq -17.6 \times 10^{-4} \text{ arc-sec/century} $$

Even if Dicke and Goldenberg are right, and the sun has an angular momentum
25 times larger than generally believed, the precession caused by $\boldsymbol{\zeta}$ is still only of
order 0.04" per century, too small to be measured.
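The arithmetic behind these numbers can be reproduced directly from (9.5.22). The sketch below uses the values quoted in the text for $J_\odot$, $M_\odot$, $L$ and $h$; the only outside input is Mercury's orbital period, taken here as an assumed 87.969 days to convert revolutions to centuries.

```python
import math

# Values quoted in the text.
J_SUN_CGS = 1.7e48        # g cm^2 / sec
M_SUN_G = 1.99e33         # g
C_CM_PER_SEC = 3.0e10     # 1 sec = 3e10 cm in natural units
L_MERCURY_KM = 55.5e6     # semilatus rectum
H_MERCURY_KM = 9.03e3     # angular momentum per unit mass (a length when c = 1)

# Assumed value, not from the text.
MERCURY_PERIOD_DAYS = 87.969

# J/M in natural units: cm^2/sec divided by c gives cm; then convert to km.
j_over_m_km = J_SUN_CGS / M_SUN_G / C_CM_PER_SEC / 1.0e5
j_over_m_rounded = round(j_over_m_km, 2)   # the text rounds this to 0.28 km


def rotation_precession(j_over_m, h, L):
    """Eq. (9.5.22): precession per revolution, in radians."""
    return -8.0 * math.pi * j_over_m * h / L**2


delta_rad = rotation_precession(j_over_m_rounded, H_MERCURY_KM, L_MERCURY_KM)
revs_per_century = 36525.0 / MERCURY_PERIOD_DAYS
delta_arcsec_century = math.degrees(delta_rad) * 3600.0 * revs_per_century

print(f"J/M            = {j_over_m_km:.3f} km")
print(f"per revolution = {delta_rad:.3e} rad")
print(f"per century    = {delta_arcsec_century:.3e} arcsec")
```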

Perhaps it should be stressed again that the total precession is to be calculated
by adding the Newtonian term 532" per century, plus the Einstein term (9.5.17),
plus the $\boldsymbol{\zeta}$ term (9.5.22), plus a Newtonian term arising from any oblateness of the
sun, plus a term arising from the contribution of the sun's rotation to the anisotropic
part of $\psi$, plus terms arising from post-Newtonian corrections to the perturbations
caused by other planets. Only the Newtonian terms and the Einstein term
(9.5.17) are large enough to be measured.


6 Precession of Orbiting Gyroscopes 


We saw in Section 5.1 that the spin $S_\mu$ of a particle in free fall precesses
according to the equation of parallel transport:

$$ \frac{dS_\mu}{d\tau} = \Gamma^\lambda_{\mu\nu}S_\lambda\frac{dx^\nu}{d\tau} \tag{9.6.1} $$

A few years ago Pugh² and Schiff³ suggested that a gyroscope might be placed in
orbit around the earth, and the precession of its spin vector be used to measure the
fine details of the earth's gravitational field. Schiff made use of a calculational
method developed by Papapetrou⁴ and Fock⁵ in which the motion is first calculated
for an extended spinning body and then evaluated in the limit as the body size
goes to zero. We shall instead treat the gyroscope as a point particle from the
beginning, because for such particles the Principle of Equivalence tells us that there
is a locally inertial frame of reference in which the spin does not precess, and we
can use Eq. (9.6.1) as the translation of this statement into a general coordinate
system.

The spin four-vector is defined to remain orthogonal to the velocity

$$ S_\mu \frac{dx^\mu}{d\tau} = 0 $$

[See Eq. (5.1.9).] In other words

$$ S_0 = -v^i S_i \tag{9.6.2} $$

We set $\mu = i$ in (9.6.1), multiply with $d\tau/dt$, and use (9.6.2) to eliminate $S_0$; this
gives

$$ \frac{dS_i}{dt} = \Gamma^j_{i0} S_j - \Gamma^0_{i0} v^j S_j + \Gamma^j_{ik} v^k S_j - \Gamma^0_{ik} v^k v^j S_j \tag{9.6.3} $$

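The post-Newtonian approximation will allow us to evaluate the coefficient of
$S_j$ on the right-hand side to order $\bar v^3/\bar r$:

$$ \frac{dS_i}{dt} \simeq \left[ \overset{3}{\Gamma}{}^j_{i0} - \overset{2}{\Gamma}{}^0_{i0} v^j + \overset{2}{\Gamma}{}^j_{ik} v^k \right] S_j \tag{9.6.4} $$

(The last term drops out because there is no $\overset{1}{\Gamma}{}^0_{ik}$.) The components of the affine
connection are provided by Eqs. (9.1.69), (9.1.70), and (9.1.72). We find

$$ \frac{d\mathbf{S}}{dt} \simeq \tfrac12 \mathbf{S} \times (\nabla \times \boldsymbol{\zeta}) - \mathbf{S} \frac{\partial\phi}{\partial t} - 2(\mathbf{v}\cdot\mathbf{S})\nabla\phi - \mathbf{S}(\mathbf{v}\cdot\nabla\phi) + \mathbf{v}(\mathbf{S}\cdot\nabla\phi) \tag{9.6.5} $$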

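To solve (9.6.5), we make use of the fact that parallel transport preserves the
value of $S_\mu S^\mu$, so that [see Eq. (5.1.10)]

$$ \frac{d}{dt}\left( g^{\mu\nu} S_\mu S_\nu \right) = 0 \tag{9.6.6} $$

The rate of change of $\mathbf{S}$ is seen from (9.6.4) to be of order $S$ times $\bar v^3/\bar r$, so we need
only keep those terms in $g^{\mu\nu} - \eta^{\mu\nu}$ whose rate of change is comparable as seen
by a particle moving with velocity $v$, that is, those terms whose gradient is of
order $\bar v^2/\bar r$. Hence $g^{\mu\nu}$ may be replaced in Eq. (9.6.6) with $\eta^{\mu\nu} + \overset{2}{g}{}^{\mu\nu}$.
Furthermore $(S_0)^2$ is already of order $v^2$ with respect to $\mathbf{S}^2$, so we need not keep $\overset{2}{g}{}^{00}$. Thus,
to the order needed here, we expect that (9.6.5) will have the integral

$$ \mathbf{S}^2 + 2\phi \mathbf{S}^2 - (\mathbf{v}\cdot\mathbf{S})^2 = \text{constant} \tag{9.6.7} $$

This suggests that we should introduce a new spin vector $\boldsymbol{\mathcal{S}}$ by

$$ \mathbf{S} = (1 - \phi)\boldsymbol{\mathcal{S}} + \tfrac12 \mathbf{v}(\mathbf{v}\cdot\boldsymbol{\mathcal{S}}) \tag{9.6.8} $$

so that to order $v^2 S^2$, (9.6.7) reads

$$ \boldsymbol{\mathcal{S}}^2 = \text{constant} \tag{9.6.9} $$

To the required order, we can invert (9.6.8) to read

$$ \boldsymbol{\mathcal{S}} = (1 + \phi)\mathbf{S} - \tfrac12 \mathbf{v}(\mathbf{v}\cdot\mathbf{S}) \tag{9.6.10} $$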
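As a quick check that the change of variable (9.6.8) does what is claimed, one can substitute it into the left-hand side of (9.6.7), keeping terms through second order in $v$ (recall that $\phi$ is itself of order $v^2$). The lines below are a worked sketch of that substitution, not a quotation from the text.

```latex
\begin{align*}
\mathbf{S}^2 &= (1 - 2\phi)\,\boldsymbol{\mathcal{S}}^2
   + (\mathbf{v}\cdot\boldsymbol{\mathcal{S}})^2 + O(v^4 \mathcal{S}^2), \\
(\mathbf{v}\cdot\mathbf{S})^2 &= (\mathbf{v}\cdot\boldsymbol{\mathcal{S}})^2
   + O(v^4 \mathcal{S}^2), \\
\mathbf{S}^2 + 2\phi\,\mathbf{S}^2 - (\mathbf{v}\cdot\mathbf{S})^2
  &= (1 - 2\phi)\,\boldsymbol{\mathcal{S}}^2 + 2\phi\,\boldsymbol{\mathcal{S}}^2
   + (\mathbf{v}\cdot\boldsymbol{\mathcal{S}})^2
   - (\mathbf{v}\cdot\boldsymbol{\mathcal{S}})^2 + O(v^4 \mathcal{S}^2) \\
  &= \boldsymbol{\mathcal{S}}^2 + O(v^4 \mathcal{S}^2).
\end{align*}
```

The $\phi$ terms and the $(\mathbf{v}\cdot\boldsymbol{\mathcal{S}})^2$ terms cancel in pairs, which is why the coefficients $1 - \phi$ and $\tfrac12$ were chosen.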
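The rate of change of $\boldsymbol{\mathcal{S}}$ is given to order $(\bar v^3/\bar r)S$ by treating $\mathbf{S}$ as constant
everywhere it appears with coefficients of order $v^2$, and setting $d\mathbf{v}/dt \simeq -\nabla\phi$. We then
find

$$ \frac{d\boldsymbol{\mathcal{S}}}{dt} = \frac{d\mathbf{S}}{dt} + \mathbf{S}\left( \frac{\partial\phi}{\partial t} + \mathbf{v}\cdot\nabla\phi \right) + \tfrac12 \nabla\phi\,(\mathbf{v}\cdot\mathbf{S}) + \tfrac12 \mathbf{v}\,(\mathbf{S}\cdot\nabla\phi) $$

and inserting (9.6.5), we find to order $(\bar v^3/\bar r)\mathcal{S}$:

$$ \frac{d\boldsymbol{\mathcal{S}}}{dt} = \boldsymbol{\Omega} \times \boldsymbol{\mathcal{S}} \tag{9.6.11} $$

where

$$ \boldsymbol{\Omega} = -\tfrac12 \nabla \times \boldsymbol{\zeta} - \tfrac32 \mathbf{v} \times \nabla\phi \tag{9.6.12} $$

Eq. (9.6.11) shows that $\boldsymbol{\mathcal{S}}$ just precesses at a rate $|\boldsymbol{\Omega}|$ around the direction of $\boldsymbol{\Omega}$,
with no change in magnitude, thus verifying (9.6.9).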
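The vector algebra that turns (9.6.5) into the pure precession (9.6.11) can be spot-checked numerically. The sketch below uses plain Python with illustrative helper names (none of them come from the text). It draws random values for the spin, the velocity, $\nabla\phi$, $\nabla\times\boldsymbol{\zeta}$, and $\partial\phi/\partial t$, then compares the rate obtained by differentiating (9.6.10) with $\boldsymbol{\Omega}\times\mathbf{S}$, where $\boldsymbol{\Omega}$ is taken from (9.6.12). To the order considered, $\mathbf{S}$ and $\boldsymbol{\mathcal{S}}$ are interchangeable on the right-hand side.

```python
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def add(*vectors):
    return tuple(sum(components) for components in zip(*vectors))


def scale(k, a):
    return tuple(k * x for x in a)


def spin_rate(S, v, grad_phi, curl_zeta, dphi_dt):
    """dS/dt as given by Eq. (9.6.5)."""
    return add(
        scale(0.5, cross(S, curl_zeta)),
        scale(-dphi_dt, S),
        scale(-2.0 * dot(v, S), grad_phi),
        scale(-dot(v, grad_phi), S),
        scale(dot(S, grad_phi), v),
    )


def script_spin_rate(S, v, grad_phi, curl_zeta, dphi_dt):
    """Rate of change of the rescaled spin (9.6.10), using dv/dt = -grad(phi)."""
    return add(
        spin_rate(S, v, grad_phi, curl_zeta, dphi_dt),
        scale(dphi_dt + dot(v, grad_phi), S),
        scale(0.5 * dot(v, S), grad_phi),
        scale(0.5 * dot(S, grad_phi), v),
    )


def precession_vector(v, grad_phi, curl_zeta):
    """Omega of Eq. (9.6.12)."""
    return add(scale(-0.5, curl_zeta), scale(-1.5, cross(v, grad_phi)))


rng = random.Random(0)


def random_vector():
    return tuple(rng.uniform(-1.0, 1.0) for _ in range(3))


S, v, grad_phi, curl_zeta = (random_vector() for _ in range(4))
dphi_dt = rng.uniform(-1.0, 1.0)

lhs = script_spin_rate(S, v, grad_phi, curl_zeta, dphi_dt)
rhs = cross(precession_vector(v, grad_phi, curl_zeta), S)
max_diff = max(abs(a - b) for a, b in zip(lhs, rhs))
print(max_diff)
```

The two sides agree to rounding error for any inputs, since the cancellation is an algebraic identity: the $\partial\phi/\partial t$ and $\mathbf{S}(\mathbf{v}\cdot\nabla\phi)$ terms drop out, and the remaining terms combine into a double cross product.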

What does this have to do with the measurement of the precession of a
gyroscope in free fall? The answer as always is to be sought by reference to the
method actually used to measure the effect. In the present case the spin direction
of the gyroscope is monitored by measuring, in the inertial frame moving with the
gyroscope, the angles $\theta$ between the spin $\mathbf{S}_g$ of the gyroscope in this frame and the
velocity vectors $\mathbf{u}_g$ of light rays from one or more distant stars:

$$ \cos\theta = \frac{\mathbf{S}_g \cdot \mathbf{u}_g}{|\mathbf{S}_g|\,|\mathbf{u}_g|} \tag{9.6.13} $$

(This angle can be measured by focusing the star's image on an array of photoelectric
cells fixed to the gyroscope, in such a way that a change in $\theta$ causes the
image to move over the cells, producing a change in the photoelectric current.)
In the inertial frame of the gyroscope, light moves with unit velocity

$$ |\mathbf{u}_g| = 1 $$

the time component of $S_{g\mu}$ vanishes [see Eq. (9.6.2)]

$$ S_{g0} = 0 $$

and the vector $\mathbf{S}_g$ has constant magnitude

$$ |\mathbf{S}_g| = (S_\mu S^\mu)^{1/2} $$

Thus, the measured angles $\theta$ can be expressed in the form

$$ \cos\theta = \frac{S_\mu u^\mu}{(S_\nu S^\nu)^{1/2}} \tag{9.6.14} $$


This is now an invariant, so we are no longer restricted to the rather inconvenient
inertial coordinate system fixed to the gyroscope, but can use for $S_\mu$ and $u^\mu$ the
spin and light-velocity four-vectors in any convenient coordinate system, such as
the reference frame fixed to the earth. In this frame we have for the velocity
four-vector of the starlight ray:

$$ u^i = u^i_\infty + \delta u^i $$
$$ u^0 = 1 + \delta u^0 $$

where $\mathbf{u}_\infty$ is a fixed unit vector giving the light velocity far from the earth, and
$\delta u^\mu$ is a correction of order $M_\oplus G/r \sim v^2$, arising from the effect of the earth's
gravitational field on the speed and direction of light rays. Also, Eqs. (9.6.10)
and (9.6.2) give, to order $v^2$,

$$ S_i = \mathcal{S}_i - \phi\,\mathcal{S}_i + \tfrac12 v_i (\mathbf{v}\cdot\boldsymbol{\mathcal{S}}) + O(v^4) $$
$$ S_0 = -\mathbf{v}\cdot\boldsymbol{\mathcal{S}} + O(v^3) $$

Thus (9.6.14) gives the measured angle $\theta$ as

$$ \cos\theta \simeq \hat{\boldsymbol{\mathcal{S}}} \cdot \left[ \mathbf{u}_\infty - \mathbf{v} + \delta\mathbf{u} - \phi\,\mathbf{u}_\infty + \tfrac12 \mathbf{v}(\mathbf{v}\cdot\mathbf{u}_\infty) \right] \tag{9.6.15} $$

where $\hat{\boldsymbol{\mathcal{S}}} \equiv \boldsymbol{\mathcal{S}}/|\boldsymbol{\mathcal{S}}|$. The term $-\mathbf{v}$ represents the aberration of starlight, an
important effect known since the eighteenth century, which certainly must be taken
into account. Apart from this term, $\cos\theta$ evidently changes with time because
$\delta\mathbf{u}$, $\phi$, and $\mathbf{v}$ change as the gyroscope revolves about the earth, and also because
$\hat{\boldsymbol{\mathcal{S}}}$ precesses with angular frequency $\boldsymbol{\Omega}$. Indeed, the fractional changes in $\cos\theta$
produced by the variations in each of $\delta\mathbf{u}$, $\phi$, $\mathbf{v}$, and $\hat{\boldsymbol{\mathcal{S}}}$ in the course of one revolution
are, aside from aberration, all of order $v^2$ [see Eqs. (8.5.8) and (9.6.12)], so in
order to measure the precession of $\hat{\boldsymbol{\mathcal{S}}}$ in one revolution of the gyroscope it would
be necessary to measure $\theta$ to an accuracy of order $10^{-10}$ radians, and even then
we would have to disentangle the effect of the bending $\delta\mathbf{u}$ of starlight and the other
terms in (9.6.15) in order to interpret the result as a spin precession. Fortunately,
there is one property of the spin precession that distinguishes it from all other
effects: it is cumulative. After a large number $N$ of revolutions, the spin direction
$\hat{\boldsymbol{\mathcal{S}}}$ will have changed by an amount of order $N v^2$, while $\delta\mathbf{u}$, $\phi$, and $\mathbf{v}(\mathbf{v}\cdot\mathbf{u}_\infty)$ will
still be of order $v^2$, so to a good approximation the change in $\theta$ will, after aberration
is taken into account, be just given by the change in $\hat{\boldsymbol{\mathcal{S}}}$:

$$ \Delta(\cos\theta) \simeq \mathbf{u}_\infty \cdot \Delta\hat{\boldsymbol{\mathcal{S}}} \tag{9.6.16} $$

Our conclusion then is that the precession of $\boldsymbol{\mathcal{S}}$ is a directly measurable effect,
provided we have the patience to wait for the gyroscope to complete many
revolutions about the earth.

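Returning now to the problem of calculating $\boldsymbol{\Omega}$: if we regard the earth as a
rotating sphere at rest, the fields $\boldsymbol{\zeta}$ and $\phi$ can be taken from (9.4.34) and (9.4.22):

$$ \phi = -\frac{G M_\oplus}{r} \qquad \boldsymbol{\zeta} = \frac{2G}{r^3} (\mathbf{x} \times \mathbf{J}_\oplus) $$

The precession frequency (9.6.12) is therefore

$$ \boldsymbol{\Omega} = 3G\,\mathbf{x}(\mathbf{x}\cdot\mathbf{J}_\oplus)\,r^{-5} - G\,\mathbf{J}_\oplus\,r^{-3} + \frac{3 G M_\oplus}{2 r^3}\,\mathbf{x} \times \mathbf{v} \tag{9.6.17} $$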


The last term, which depends only upon the mass of the earth but not its spin, 
is called the geodetic precession ; 6 it is essentially just the Thomas precession caused 
by gravitation. (See Section 5.1.) The first two terms represent an interaction 
between the spin orbital angular momenta of the earth and the gyroscope, analogous 
to the hyperfine interaction of atomic physics. If for simplicity we take the gyro- 
scope’s orbit to be a circle of radius r with unit normal h, then the gyroscope’s 
velocity is 

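$$ \mathbf{v} = -\left( \frac{M_\oplus G}{r^3} \right)^{1/2} (\mathbf{x} \times \hat{\mathbf{h}}) \tag{9.6.18} $$

and the precession rate, averaged over a revolution, is

$$ \langle \boldsymbol{\Omega} \rangle = \left( \mathbf{J}_\oplus - 3\hat{\mathbf{h}}(\hat{\mathbf{h}}\cdot\mathbf{J}_\oplus) \right) \frac{G}{2 r^3} + \frac{3 (M_\oplus G)^{3/2}\,\hat{\mathbf{h}}}{2 r^{5/2}} \tag{9.6.19} $$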



Both terms are maximized by taking $r$ as small as possible, that is, about equal to
the radius of the earth. At these low altitudes the ratio of the first “hyperfine”
term to the second “geodetic” term is of order

$$ \frac{\text{hyperfine}}{\text{geodetic}} \simeq \frac{J_\oplus G}{3 (M_\oplus G)^{3/2} R_\oplus^{1/2}} = 6.5 \times 10^{-3} \tag{9.6.20} $$

so the main effect is a precession of the spin around the orbital angular momentum
$\hat{\mathbf{h}}$, with average rate

$$ \langle |\boldsymbol{\Omega}| \rangle \simeq \frac{3 (M_\oplus G)^{3/2}}{2 r^{5/2}} = 8.4 \left( \frac{R_\oplus}{r} \right)^{5/2} \ \text{sec/year} \tag{9.6.21} $$

This should be measurable.⁷ In order to detect the small “hyperfine” precession,
one might direct the spin axis of the gyroscope along the direction $\hat{\mathbf{h}}$ normal to the
plane of the orbit; in this case the terms in $\boldsymbol{\Omega}$ parallel to $\hat{\mathbf{h}}$ have no effect [see
Eq. (9.6.11)] so the effective precession is just around $\mathbf{J}_\oplus$:

$$ \langle \boldsymbol{\Omega} \rangle_{\rm eff} = \frac{G\,\mathbf{J}_\oplus}{2 r^3} \tag{9.6.22} $$

with magnitude

$$ |\langle \boldsymbol{\Omega} \rangle_{\rm eff}| = 0.055 \left( \frac{R_\oplus}{r} \right)^{3} \ \text{sec/year} \tag{9.6.23} $$

In order to maximize the effect of this tiny precession, one would like to have the
spin axis of the gyroscope perpendicular to $\mathbf{J}_\oplus$; since it must also be perpendicular
to the plane of the orbit, the best arrangement would be to place the gyroscope in
a polar orbit, with its spin axis parallel to the equatorial plane of the earth.
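The numerical coefficients in (9.6.20), (9.6.21), and (9.6.23) can be reproduced by restoring the factors of $c$ and inserting terrestrial values. The sketch below is illustrative only: the CGS constants, the earth's spin angular momentum ($5.86 \times 10^{40}$ g cm$^2$ sec$^{-1}$), and the function names are assumptions that do not appear in the text.

```python
import math

# Assumed CGS values (not given in the text).
G = 6.674e-8          # cm^3 g^-1 s^-2
C = 2.998e10          # cm / s
M_EARTH = 5.972e27    # g
R_EARTH = 6.371e8     # cm
J_EARTH = 5.86e40     # g cm^2 / s, spin angular momentum of the earth

SECONDS_PER_YEAR = 3.156e7
ARCSEC_PER_RADIAN = 180.0 * 3600.0 / math.pi


def to_arcsec_per_year(rate_rad_per_s: float) -> float:
    return rate_rad_per_s * SECONDS_PER_YEAR * ARCSEC_PER_RADIAN


def geodetic_rate(r: float) -> float:
    """3 (MG)^(3/2) / (2 r^(5/2)) with c restored, in rad/s."""
    return 1.5 * (G * M_EARTH) ** 1.5 / (C ** 2 * r ** 2.5)


def hyperfine_rate(r: float) -> float:
    """G J / (2 r^3) with c restored, in rad/s."""
    return G * J_EARTH / (2.0 * C ** 2 * r ** 3)


geodetic = to_arcsec_per_year(geodetic_rate(R_EARTH))
hyperfine = to_arcsec_per_year(hyperfine_rate(R_EARTH))
ratio = hyperfine / geodetic

print(f"geodetic  ~ {geodetic:.2f} arc-sec/year")
print(f"hyperfine ~ {hyperfine:.3f} arc-sec/year")
print(f"ratio     ~ {ratio:.2e}")
```

With these inputs the three numbers come out close to the quoted 8.4 sec/year, 0.055 sec/year, and $6.5 \times 10^{-3}$. Both rates fall off with altitude through the factors $(R_\oplus/r)^{5/2}$ and $(R_\oplus/r)^3$, which the functions reproduce when called with $r > R_\oplus$.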

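As usual, the effect of putting the satellite in an eccentric orbit would simply
be to replace the radius $r$ everywhere with the semilatus rectum $L$. It is also easy
to take into account the effect of a possible departure from Einstein's field equations.
The Robertson expansion (8.3.1) for a general static spherically symmetric metric
in isotropic coordinates gives (with $\alpha = 1$)

$$ \overset{2}{g}_{00} = -2\phi \qquad \overset{2}{g}_{ij} = -2\gamma\phi\,\delta_{ij} \qquad \overset{3}{g}_{i0} = 0 $$

where $\phi$ as usual is $-GM/r$ and $\gamma$ is a dimensionless constant that in Einstein's
theory would be unity. Referring back to (9.1.18), (9.1.19), and (9.1.21), we see
that now

$$ \overset{2}{\Gamma}{}^j_{ik} = \gamma \left[ -\delta_{ij} \frac{\partial\phi}{\partial x^k} - \delta_{jk} \frac{\partial\phi}{\partial x^i} + \delta_{ik} \frac{\partial\phi}{\partial x^j} \right] $$
$$ \overset{2}{\Gamma}{}^0_{i0} = \frac{\partial\phi}{\partial x^i} $$
$$ \overset{3}{\Gamma}{}^j_{i0} = 0 $$

Using these in (9.6.4) gives for the rate of change of spin

$$ \frac{d\mathbf{S}}{dt} = -(1 + \gamma)(\mathbf{v}\cdot\mathbf{S})\nabla\phi - \gamma(\mathbf{v}\cdot\nabla\phi)\mathbf{S} + \gamma(\mathbf{S}\cdot\nabla\phi)\mathbf{v} $$

As before, it is convenient to introduce a spin vector of constant magnitude, which
now is

$$ \boldsymbol{\mathcal{S}} = (1 + \gamma\phi)\mathbf{S} - \tfrac12 \mathbf{v}(\mathbf{v}\cdot\mathbf{S}) $$

Again, $\boldsymbol{\mathcal{S}}$ just precesses about a vector $\boldsymbol{\Omega}$

$$ \frac{d\boldsymbol{\mathcal{S}}}{dt} = \boldsymbol{\Omega} \times \boldsymbol{\mathcal{S}} $$

but now $\boldsymbol{\Omega}$ is given by

$$ \boldsymbol{\Omega} = -\left( \tfrac12 + \gamma \right)(\mathbf{v} \times \nabla\phi) \tag{9.6.24} $$

Thus, the effect of a modification of Einstein's field equations on the geodetic
precession is simply to multiply it with a factor

$$ \frac{1 + 2\gamma}{3} $$

In order to calculate the effects on $\boldsymbol{\Omega}$ of a modification of Einstein's field equations
in a system that is not static and spherically symmetric, it would be necessary to
know the details of the new theory; we shall return to this problem in Section 9.9.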
7 Spin Precession and Mach's Principle*

The spin precession effects calculated in the last section have a remarkable
interpretation in terms of the ideas of Ernst Mach discussed in Section 1.3. Recall
that the spin of a freely falling gyroscope does not precess in an inertial coordinate
system that moves along with the gyroscope; this after all is just the meaning of
Eq. (9.6.1) of parallel transport. Thus, the precession $\boldsymbol{\Omega}$ of a gyroscope in another
frame, say one fixed to the earth, arises entirely from a rotation of the inertial
frame carried by the gyroscope with respect to earth and the distant stars, with
angular frequency $\boldsymbol{\Omega}$. This is why $\boldsymbol{\Omega}$ does not depend on the rate of the gyroscope's
spin; any vector that keeps a fixed direction in an inertial system will appear to
precess in the “lab” or earth system with angular frequency $\boldsymbol{\Omega}$ given by Eq.
(9.6.12).

Why should the inertial frame that falls with the gyroscope rotate with respect 
to the distant stars ? Mach tells us that inertial forces arise from accelerations, 
including rotations, with respect to the total matter of the universe, so a reference 
frame will be inertial if it is not accelerating with respect to some average cosmic 
distribution of matter. Normally this means that the inertial frames do not rotate 
with respect to the distant stars. However, an observer on a gyroscope orbiting the 
earth sees a mass distribution consisting, not only of the distant stars, but also 
of a large sphere called the earth, which appears to revolve around the gyroscope 
once every 90 minutes or so, and which also rotates on its own axis. Thus the
inertial frames fixed to the gyroscope have to reach some sort of compromise
between following the distant stars and following the earth; they try to rotate in
the same direction as the rotation and apparent revolution of the earth, but lag
far behind, the distant stars always winning the struggle.

The rather vague ideas of this sort that are suggested by Mach's principle
find their concrete expression in detailed calculations based on the Principle of
Equivalence. We saw in Eq. (9.6.19) that the precession of an orbiting gyroscope,
and hence the rotation of the inertial frame it carries, consists of a small “geodetic”
term parallel to the orbital angular momentum $\hat{\mathbf{h}}$, and an even smaller “hyperfine”
term parallel to the component of the earth's spin perpendicular to $\hat{\mathbf{h}}$. Thus
the rotation and apparent revolution of the earth about the gyroscope do seem
slightly to drag along the inertial frame that falls with the gyroscope.

This effect may be seen more clearly in a thought-experiment discussed by
Lense and Thirring⁸ shortly after the advent of general relativity. They considered
a hollow spherical shell that rotates rigidly with angular velocity $\boldsymbol{\omega}$. According
to Eq. (9.4.35), the metric field $\boldsymbol{\zeta}$ inside the sphere is

$$ \boldsymbol{\zeta} = \mathbf{x} \times \boldsymbol{\Omega} $$

where

$$ \boldsymbol{\Omega} = -\tfrac43 \phi\,\boldsymbol{\omega} $$

* This section lies somewhat out of the book’s main line of development, and may be omitted in a first 
reading. 
with $\phi$ the constant gravitational potential inside the sphere

$$ \phi = -4\pi G \int_{\rm shell} \overset{0}{T}{}^{00}(r')\, r'\, dr' $$

Equation (9.6.12) then tells us that any inertial frame inside the sphere rotates
with angular velocity $\boldsymbol{\Omega}$.

We note that $\boldsymbol{\Omega}$ is parallel to $\boldsymbol{\omega}$, but smaller by the dimensionless factor
$-4\phi/3$. It is therefore interesting to ask what would happen if the shell were made
so massive that $\phi$ approached a value of order $-\tfrac34$. Would the inertial frames inside
the shell decouple entirely from the distant stars and follow the shell, rotating with
frequency $\boldsymbol{\omega}$? (We catch an echo of Mach's remark about Newton's water bucket
experiment quoted in Section 1.3, “No one is competent to say how the experiment
would turn out if the sides of the vessel increased in thickness and mass until they
were several leagues thick.”) Unfortunately, the post-Newtonian method breaks
down just when this problem becomes interesting, when $|\phi|$ is of order unity.
An exact solution of Einstein's equations that looks like the metric outside a
spinning sphere has been found by Kerr;⁹ it is of the form


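$$ -d\tau^2 = -dt^2 + d\mathbf{x}^2 + \frac{2MG\rho}{(\rho^4 + (\mathbf{x}\cdot\mathbf{a})^2)(\rho^2 + a^2)^2} \left[ \rho^2\,\mathbf{x}\cdot d\mathbf{x} + \rho\,d\mathbf{x}\cdot(\mathbf{a}\times\mathbf{x}) + (\mathbf{a}\cdot\mathbf{x})(\mathbf{a}\cdot d\mathbf{x}) + (\rho^2 + a^2)\rho\,dt \right]^2 $$

where $\mathbf{x}$ is a quasi-Euclidean three-vector; $\mathbf{a}$ is a constant vector; scalar products
$\mathbf{x}\cdot\mathbf{a}$, $\mathbf{x}^2$, and so on, are defined as in Euclidean geometry; and $\rho$ is defined by

$$ \rho^4 - (r^2 - a^2)\rho^2 - (\mathbf{a}\cdot\mathbf{x})^2 = 0 $$

where as usual, $r^2 = \mathbf{x}^2$. For $r \to \infty$, we have $\rho \to r$, and the metric coefficients
become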


9oo ~ 1 + + O 

r 


2MG f 1 , J ^ (l 

So x x >‘ + ohs 


Sij + 


2MG 


T" XiX J 


+ O 



A straightforward calculation, using Eqs. (7.6.22)-(7.6.24), shows that the total
momentum, energy, and angular momentum of the system and its gravitational
field are

$$ \mathbf{P} = 0 $$
$$ P^0 = M $$
$$ \mathbf{J} = M\mathbf{a} $$
Unfortunately, it has not yet been possible to show that this exact exterior solution
fits smoothly to an exact solution inside a spinning sphere. Recently Brill and
Cohen¹⁰ have found a solution for a very thin rotating spherical shell, which is
valid both inside and outside the shell to lowest order in the rotation frequency
$\omega$ but to all orders in the shell mass $M$, and which satisfies the correct continuity
conditions across the shell radius $R$. The solution is

$$ -d\tau^2 = -H(r)\,dt^2 + J(r) \left[ dr^2 + r^2 d\theta^2 + r^2 \sin^2\theta\,(d\varphi - \Omega(r)\,dt)^2 \right] $$

where

$$ H(r) = \begin{cases} \left( \dfrac{1 - 2MG/r}{1 + 2MG/r} \right)^2 & (r > R) \\[2ex] \left( \dfrac{1 - 2MG/R}{1 + 2MG/R} \right)^2 & (r < R) \end{cases} $$

$$ J(r) = \begin{cases} (1 + 2MG/r)^4 & (r > R) \\ (1 + 2MG/R)^4 & (r < R) \end{cases} $$

Inside the sphere the angular velocity $\Omega(r)$ is a constant

$$ \Omega = \omega \left[ 1 + \frac{3(R - 2MG)}{4MG(1 + \beta)} \right]^{-1} \qquad (r < R) $$

with $\beta$ a dimensionless constant that depends on the relative contributions of
$T^{ij}$ and $T^{00}$ to the shell's gravitational mass. We get an inertial coordinate system
inside the sphere if we define new coordinates

$$ t' = \sqrt{H}\,t \qquad r' = \sqrt{J}\,r \qquad \varphi' = \varphi - \Omega t $$

so $\Omega$ is the rotation frequency (in $t$ units) of the inertial frames within the shell
with respect to the Minkowski metric at infinity. When $MG$ is small and $\beta$ is
small, $\Omega/\omega$ approaches the post-Newtonian value $4MG/3R$, but when $MG$ is so
large that the Schwarzschild radius $2MG$ of the shell approaches the shell radius
$R$, the ratio $\Omega/\omega$ approaches unity, as Mach might have expected.
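The two limits just quoted can be checked against the interior formula. The sketch below assumes the interior dragging ratio has the form $\Omega/\omega = [1 + 3(R - 2MG)/(4MG(1+\beta))]^{-1}$, which is a reading of the displayed equation chosen to reproduce both stated limits. The function name and the sample values of $MG/R$ are illustrative only.

```python
def dragging_ratio(mg_over_r: float, beta: float = 0.0) -> float:
    """Omega/omega inside the shell as a function of x = MG/R.

    Assumed form: 1 / (1 + 3(R - 2MG) / (4 MG (1 + beta))),
    rewritten in terms of x as 1 / (1 + 3(1 - 2x) / (4x(1 + beta))).
    """
    x = mg_over_r
    return 1.0 / (1.0 + 3.0 * (1.0 - 2.0 * x) / (4.0 * x * (1.0 + beta)))


# Weak-field limit: compare with the post-Newtonian value 4MG/3R.
x_weak = 1.0e-6
weak = dragging_ratio(x_weak)
post_newtonian = 4.0 * x_weak / 3.0

# Strong-field limit: shell radius equal to 2MG.
strong = dragging_ratio(0.5)

print(weak / post_newtonian)
print(strong)
```

For small $MG/R$ the ratio agrees with $4MG/3R$ up to corrections of relative order $MG/R$, and at $R = 2MG$ it equals unity for any $\beta$, so the inertial frames inside are then carried around with the shell.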


8 Post-Newtonian Hydrodynamics* 

The post-Newtonian program outlined in Sections 9.1-9.3 would form an
adequate basis for relativistic celestial mechanics if the sun and planets could be
regarded as point particles. However, this is not the case; for instance, the tidal
forces on the moon due to its finite size are very much larger than the effects of the
post-Newtonian corrections to the earth's gravitational field. Often such finite-size
effects may be calculated to sufficient accuracy if we treat astronomical bodies

* This section lies somewhat out of the book’s main line of development, and may be omitted in a first 
reading. 
as composed of perfect fluids.¹¹ The energy-momentum tensor is then given by
Eq. (5.4.2):

$$ T^{\mu\nu} = p\,g^{\mu\nu} + (p + \rho)\,U^\mu U^\nu \tag{9.8.1} $$

where $p$ and $\rho$ are the proper pressure and energy density, that is, those measured
by a locally comoving and freely falling observer, and $U^\mu$ is the four-vector velocity
$dx^\mu/d\tau$. (Of course, $p$ and $\rho$ vanish except within the sun and planets.) To calculate
$U^\mu$ we set

$$ \frac{U^i}{U^0} = \frac{dx^i}{dt} \tag{9.8.2} $$

and we calculate $U^0$ from (9.2.2):

$$ U^0 = \frac{dt}{d\tau} = 1 - \phi + \tfrac12 v^2 + O(\bar v^4) \tag{9.8.3} $$


The program for calculating the motion of the fluid depends crucially on
whether there is an equation of state giving $p$ as a function of $\rho$, as is the case for
the cold degenerate fluids studied in Chapter 11, or whether $p$ depends on
temperature as well. If the pressure is a function of $\rho$ alone, then our program is
essentially the same as in Section 9.3:

(A) First solve the Newtonian problem. The pressure is to be regarded as of
order $\bar v^2 \bar M/\bar r^3$, so the necessary components of the energy-momentum tensor are
given by (9.8.1)-(9.8.3) as

$$ \overset{0}{T}{}^{00} = \rho \tag{9.8.4} $$
$$ \overset{1}{T}{}^{i0} = \rho v_i \tag{9.8.5} $$
$$ \overset{2}{T}{}^{ij} = p\,\delta_{ij} + \rho v_i v_j \tag{9.8.6} $$

The Newtonian equations of motion are provided by using (9.8.4)-(9.8.6) in the
mass- and momentum-conservation equations (9.3.2) and (9.3.3):

$$ \frac{\partial\rho}{\partial t} + \nabla\cdot(\rho\mathbf{v}) = 0 \tag{9.8.7} $$
$$ \frac{\partial}{\partial t}(\rho\mathbf{v}) + \nabla\cdot(\rho\,\mathbf{v}\mathbf{v}) = -\rho\nabla\phi - \nabla p \tag{9.8.8} $$

with $p$ given as a function of $\rho$ by the equation of state, and with $\phi$ determined
by Poisson's equation (9.3.12):

$$ \nabla^2\phi = 4\pi G\rho \tag{9.8.9} $$

(B) Use the values of $\rho$, $p$, $\mathbf{v}$, and $\phi$ determined in (A) to calculate the terms
(9.8.4)-(9.8.6) of $T^{\mu\nu}$, and also to calculate

$$ \overset{2}{T}{}^{00} = \rho(v^2 - 2\phi) \tag{9.8.10} $$
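(C) Use the results of (A) and (B) and Eqs. (9.1.62) and (9.1.65), to compute
the post-Newtonian fields $\boldsymbol{\zeta}$ and $\psi$.

(D) Solve for $\rho$, $p$, and $\mathbf{v}$ in the post-Newtonian approximation. The
energy-momentum tensor is given to the necessary order by (9.8.1)-(9.8.3) as

$$ \overset{0}{T}{}^{00} + \overset{2}{T}{}^{00} = \rho(1 + v^2 - 2\phi) \tag{9.8.11} $$
$$ \overset{1}{T}{}^{i0} + \overset{3}{T}{}^{i0} = (\rho + p + v^2\rho - 2\phi\rho)\,v_i \tag{9.8.12} $$
$$ \overset{2}{T}{}^{ij} + \overset{4}{T}{}^{ij} = p\,\delta_{ij}(1 + 2\phi) + v_i v_j\,(\rho + p - 2\phi\rho + \rho v^2) \tag{9.8.13} $$

and the post-Newtonian equations of motion are obtained by using (9.8.11)-(9.8.13)
in the energy and momentum conservation equations, (9.3.2) plus (9.3.4)
and (9.3.3) plus (9.3.5):

$$ \frac{\partial}{\partial t}\left[ \rho(1 + v^2 - 2\phi) \right] + \nabla\cdot\left[ \mathbf{v}(\rho + p + v^2\rho - 2\phi\rho) \right] = \rho\,\frac{\partial\phi}{\partial t} \tag{9.8.14} $$

$$ \begin{aligned} \frac{\partial}{\partial t}\left[ \mathbf{v}(\rho + p + v^2\rho - 2\phi\rho) \right] &+ \nabla\cdot\left[ \mathbf{v}\mathbf{v}(\rho + p - 2\phi\rho + \rho v^2) \right] \\ &= -\nabla\left[ p(1 + 2\phi) \right] - \rho\nabla(\phi + 2\phi^2 + \psi) - \rho\,\frac{\partial\boldsymbol{\zeta}}{\partial t} \\ &\quad - \rho(v^2 - 2\phi)\nabla\phi + \rho\,\mathbf{v}\times(\nabla\times\boldsymbol{\zeta}) + 4\rho\,\mathbf{v}\,\frac{\partial\phi}{\partial t} \\ &\quad - (p + \rho v^2)\nabla\phi + 2p\,\nabla\phi + 4\rho\,\mathbf{v}(\mathbf{v}\cdot\nabla\phi) \end{aligned} \tag{9.8.15} $$

(E) And so on.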

Matters are more complicated when the temperature is an independent 
variable. We now need one additional equation at every stage in the calculation, 
which is provided for us by an equation of continuity 

±y 9 pU>‘)= 0 (9.8.16) 

where p is a rest-mass density proportional to the number density of particles in 
the fluid. [Compare Eq. (5.2.14).] It may be assumed that the pressure is given 
by the equation of state as a function of both p and an energy density e = 0(v 2 ) 
defined by 

T 00 = pU° + e (9.8.17) 

Our equations are then the continuity equation (9.8.16), the momentum-conservation
equation $T^{i\nu}{}_{;\nu} = 0$, and an energy-conservation equation, which after
subtracting (9.8.16) may be written

+ h 4k T ‘° - ^ 

ct dx l 


(9.8.18) 
However, now we use the energy-conservation equation to one higher order in
$\bar v$ at every step: In the Newtonian approximation we use the continuity equation
to order $\bar v$, the momentum-conservation equation to order $\bar v^2$, and the
energy-conservation equation to order $\bar v^3$, while in the post-Newtonian approximation
we use continuity to order $\bar v^3$, momentum conservation to order $\bar v^4$, and energy
conservation to order $\bar v^5$. Without writing out these equations in detail we may
note that this program is possible, because in the Newtonian calculation we need
$\overset{3}{\Gamma}{}^0_{00}$ and $\overset{2}{\Gamma}{}^0_{i0}$, which are given by (9.1.71) and (9.1.72) in terms of $\phi$ alone, while
in the post-Newtonian calculation we also need $\overset{5}{\Gamma}{}^0_{00}$, $\overset{4}{\Gamma}{}^0_{i0}$, and $\overset{3}{\Gamma}{}^0_{ij}$, which are
given by (9.1.73)-(9.1.75) in terms of the post-Newtonian fields.


9 Approximate Solutions to the Brans-Dicke Theory 

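In order to test general relativity, it is useful to have in mind some other theory
with which to compare it. The Brans-Dicke theory described in Section 7.3 is
identical with general relativity in the physical interpretation of the metric
$g_{\mu\nu}$, and differs only in that a new scalar field $\varphi$ enters the gravitational field
equations. In order to avoid confusion with the Newtonian potential, we shall
write the Brans-Dicke scalar field $\varphi$ as $\mathcal{G}^{-1}(1 + \xi)$, with $\mathcal{G}$ a constant of order $G$,
and $\xi$ a scalar field defined by

$$ \xi^{;\rho}{}_{;\rho} = \frac{8\pi\mathcal{G}}{3 + 2\omega}\,T^\mu{}_\mu \tag{9.9.1} $$

$$ \xi \to 0 \quad \text{for} \quad r \to \infty \tag{9.9.2} $$

(See Eq. (7.3.13). We have dropped the subscript $M$, but $T^{\mu\nu}$ should be understood
as the energy-momentum tensor of matter, excluding $\xi$. Also, $\omega$ is a dimensionless
constant, perhaps of order 6.) The gravitational field equations are given by Eq.
(7.3.14) as

$$ R_{\mu\nu} - \tfrac12 g_{\mu\nu} R = -8\pi\mathcal{G}(1 + \xi)^{-1} T_{\mu\nu} - \omega(1 + \xi)^{-2}\left( \xi_{;\mu}\xi_{;\nu} - \tfrac12 g_{\mu\nu}\,\xi_{;\rho}\xi^{;\rho} \right) - (1 + \xi)^{-1}\left( \xi_{;\mu;\nu} - g_{\mu\nu}\,\xi^{;\rho}{}_{;\rho} \right) \tag{9.9.3} $$

By using (9.9.1) to determine $\xi^{;\rho}{}_{;\rho}$, and contracting (9.9.3) to find $R$, we can
rewrite this in the form

$$ R_{\mu\nu} = -8\pi\mathcal{G}(1 + \xi)^{-1}\left[ T_{\mu\nu} - \left( \frac{\omega + 1}{2\omega + 3} \right) g_{\mu\nu} T^\lambda{}_\lambda \right] - \omega(1 + \xi)^{-2}\,\xi_{;\mu}\xi_{;\nu} - (1 + \xi)^{-1}\,\xi_{;\mu;\nu} \tag{9.9.4} $$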
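It follows from (9.9.1) and (9.9.2) that $\xi$ has the expansion

$$ \xi = \overset{2}{\xi} + \overset{4}{\xi} + \cdots \tag{9.9.5} $$

where $\overset{N}{\xi}$ is of order $\bar v^N$, and in particular

$$ \nabla^2 \overset{2}{\xi} = -\frac{8\pi\mathcal{G}}{3 + 2\omega}\,\overset{0}{T}{}^{00} \tag{9.9.6} $$

The field equations are now given by (9.9.4)-(9.9.6) and (9.1.37)-(9.1.40) as

$$ \nabla^2 \overset{2}{g}_{00} = -8\pi\mathcal{G}\left( \frac{2\omega + 4}{2\omega + 3} \right) \overset{0}{T}{}^{00} \tag{9.9.7} $$

$$ \begin{aligned} \nabla^2 \overset{4}{g}_{00} &= \frac{\partial^2 \overset{2}{g}_{00}}{\partial t^2} + \overset{2}{g}_{ij}\,\frac{\partial^2 \overset{2}{g}_{00}}{\partial x^i \partial x^j} - \left( \frac{\partial \overset{2}{g}_{00}}{\partial x^i} \right)^2 \\ &\quad - 8\pi\mathcal{G}\left( \frac{2\omega + 4}{2\omega + 3} \right)\left[ \overset{2}{T}{}^{00} - 2\,\overset{2}{g}_{00}\,\overset{0}{T}{}^{00} - \overset{2}{\xi}\,\overset{0}{T}{}^{00} \right] - 8\pi\mathcal{G}\left( \frac{2\omega + 2}{2\omega + 3} \right) \overset{2}{T}{}^{ii} \\ &\quad - 2\,\frac{\partial^2 \overset{2}{\xi}}{\partial t^2} - \frac{\partial \overset{2}{\xi}}{\partial x^i}\,\frac{\partial \overset{2}{g}_{00}}{\partial x^i} \end{aligned} \tag{9.9.8} $$

$$ \nabla^2 \overset{3}{g}_{i0} = 16\pi\mathcal{G}\,\overset{1}{T}{}^{i0} - 2\,\frac{\partial^2 \overset{2}{\xi}}{\partial x^i \partial t} \tag{9.9.9} $$

$$ \nabla^2 \overset{2}{g}_{ij} = -8\pi\mathcal{G}\,\overset{0}{T}{}^{00}\,\delta_{ij}\left( \frac{2\omega + 2}{2\omega + 3} \right) - 2\,\frac{\partial^2 \overset{2}{\xi}}{\partial x^i \partial x^j} \tag{9.9.10} $$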

From (9.9.7) it follows that the gravitational constant measured by observation
of slowly moving particles or in time dilation experiments is not $\mathcal{G}$ but rather is

$$ G = \left( \frac{2\omega + 4}{2\omega + 3} \right) \mathcal{G} \tag{9.9.11} $$

That is, we have the usual relation between $\overset{2}{g}_{00}$ and the Newtonian potential $\phi$

$$ \overset{2}{g}_{00} = -2\phi \tag{9.9.12} $$

provided we define $\phi$ by

$$ \nabla^2\phi = 4\pi G\,\overset{0}{T}{}^{00} \tag{9.9.13} $$
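Also, it follows from (9.9.6) and (9.9.13) that

$$ \overset{2}{\xi} = -(\omega + 2)^{-1}\phi $$

and the field equations for $\overset{4}{g}_{00}$, $\overset{3}{g}_{i0}$, and $\overset{2}{g}_{ij}$ are

$$ \begin{aligned} \nabla^2 \overset{4}{g}_{00} &= -2\left( \frac{\omega + 1}{\omega + 2} \right)\frac{\partial^2\phi}{\partial t^2} - 2\,\overset{2}{g}_{ij}\,\frac{\partial^2\phi}{\partial x^i \partial x^j} - 2\left( \frac{2\omega + 5}{\omega + 2} \right)\left( \frac{\partial\phi}{\partial x^i} \right)^2 \\ &\quad - 8\pi G\,\overset{2}{T}{}^{00} - 8\pi G\left( 4 + \frac{1}{\omega + 2} \right)\phi\,\overset{0}{T}{}^{00} - 8\pi G\left( \frac{2\omega + 2}{2\omega + 4} \right)\overset{2}{T}{}^{ii} \end{aligned} \tag{9.9.14} $$

$$ \nabla^2 \overset{3}{g}_{i0} = 16\pi G\left( \frac{2\omega + 3}{2\omega + 4} \right)\overset{1}{T}{}^{i0} + \frac{2}{\omega + 2}\,\frac{\partial^2\phi}{\partial x^i \partial t} \tag{9.9.15} $$

$$ \nabla^2 \overset{2}{g}_{ij} = -8\pi G\,\overset{0}{T}{}^{00}\,\delta_{ij}\left( \frac{\omega + 1}{\omega + 2} \right) + \frac{2}{\omega + 2}\,\frac{\partial^2\phi}{\partial x^i \partial x^j} \tag{9.9.16} $$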


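As an example, let us consider the field of a static spherically symmetric
mass. The Newtonian potential is then a function of $r$ alone, and (9.9.16) gives

$$ \overset{2}{g}_{ij} = -2\delta_{ij}\left( \frac{\omega + 1}{\omega + 2} \right)\phi + \frac{2}{\omega + 2}\left[ \left( \delta_{ij} - \frac{3 x_i x_j}{r^2} \right)\frac{1}{r^3}\int_0^r r'^2 \phi(r')\,dr' + \frac{x_i x_j\,\phi}{r^2} \right] \tag{9.9.17} $$

Outside the mass, we have

$$ \phi = -\frac{MG}{r} \tag{9.9.18} $$

and so (9.9.17) gives

$$ \overset{2}{g}_{ij} = \left( \frac{2\omega + 1}{\omega + 2} \right)\frac{MG}{r}\,\delta_{ij} + \frac{MG}{\omega + 2}\,\frac{x_i x_j}{r^3} + \frac{2MGR^2}{\omega + 2}\left[ \delta_{ij} - \frac{3 x_i x_j}{r^2} \right]\frac{1}{r^3} \tag{9.9.19} $$

where $R$ is an effective radius, defined by

$$ MGR^2 = \int_0^\infty r^2\left[ \phi(r) + \frac{MG}{r} \right] dr \tag{9.9.20} $$

(The integrand vanishes outside the mass, so we are free to change its upper limit
from $r$ to $\infty$.) Using (9.9.18) and (9.9.19) in (9.9.14) gives

$$ \nabla^2 \overset{4}{g}_{00} = -\frac{2(2\omega + 3) M^2 G^2}{(\omega + 2)\,r^4} - \frac{24 M^2 G^2 R^2}{(\omega + 2)\,r^6} $$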





The solution is

$$ \overset{4}{g}_{00} = -\frac{(2\omega + 3) M^2 G^2}{(\omega + 2)\,r^2} - \frac{2 M^2 G^2 R^2}{(\omega + 2)\,r^4} - \frac{k M^2 G^2}{rR} \tag{9.9.21} $$

where $k$ is a dimensionless constant, which must be determined by the condition
that the exterior solution (9.9.21) fit smoothly onto a nonsingular internal solution.

The results (9.9.19)-(9.9.21) make it appear that the gravitational field outside
a spherical static mass depends upon the size and distribution of the mass. However,
this size-dependent effect can be eliminated by a suitable redefinition of
$M$ and $\mathbf{x}$

$$ M' = M\left[ 1 - \frac{kMG}{2R} \right] \tag{9.9.22} $$

$$ \mathbf{x}' = \mathbf{x}\left[ 1 + \frac{MGR^2}{(\omega + 2)\,r^3} \right] \tag{9.9.23} $$

The last two terms in (9.9.21) and the last term in (9.9.19) are then cancelled by
the changes in $\overset{2}{g}_{00}$ and $\overset{2}{g}_{ij}$, and so, dropping primes, we now have

$$ \overset{2}{g}_{00} = \frac{2MG}{r} \tag{9.9.24} $$

$$ \overset{4}{g}_{00} = -\frac{(2\omega + 3) M^2 G^2}{(\omega + 2)\,r^2} \tag{9.9.25} $$

$$ \overset{2}{g}_{ij} = \left( \frac{2\omega + 1}{\omega + 2} \right)\frac{MG}{r}\,\delta_{ij} + \frac{MG}{\omega + 2}\,\frac{x_i x_j}{r^3} \tag{9.9.26} $$

Thus the Brans-Dicke theory shares the property of the Einstein theory, that the
gravitational field outside a static spherically symmetric mass depends on $M$,
but not on any other property of the mass.

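This solution may be compared with the general Robertson expansion (8.3.7)
in harmonic coordinates, which (with $\alpha = 1$) gives

$$ \overset{2}{g}_{00} = \frac{2MG}{r} $$

$$ \overset{4}{g}_{00} = -(\gamma - 1 + 2\beta)\,\frac{M^2 G^2}{r^2} $$

$$ \overset{2}{g}_{ij} = (3\gamma - 1)\,\frac{MG}{r}\,\delta_{ij} + (1 - \gamma)\,\frac{MG\,x_i x_j}{r^3} $$

Thus the Brans-Dicke results (9.9.24)-(9.9.26) can be summarized by giving
formulas for the Robertson parameters

$$ \gamma = \frac{\omega + 1}{\omega + 2} \qquad \beta = 1 \tag{9.9.27} $$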
These formulas were already used in the last chapter to compare the Brans-Dicke
theory with experiment.

We also note that the element $\overset{3}{g}_{i0} \equiv \zeta_i$ of the metric tensor is given for a
static system by Eq. (9.9.15) as

$$ \zeta_i = -4G\left( \frac{2\omega + 3}{2\omega + 4} \right)\int \frac{\overset{1}{T}{}^{i0}(\mathbf{x}')}{|\mathbf{x} - \mathbf{x}'|}\,d^3x' \tag{9.9.28} $$

Thus the effects of the rotation of a spherical mass on the precession of spins and
perihelia are smaller in the Brans-Dicke theory (for $0 < \omega < \infty$) than in general
relativity, by a factor $(2\omega + 3)/(2\omega + 4)$.
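To get a feeling for the size of these Brans-Dicke corrections, the sketch below evaluates, for the illustrative value $\omega = 6$ mentioned earlier in this section, the Robertson parameter $\gamma = (\omega+1)/(\omega+2)$, the geodetic-precession factor $(1 + 2\gamma)/3$ of Section 9.6, and the rotation-induced factor $(2\omega+3)/(2\omega+4)$. The function names are illustrative only, and exact rational arithmetic is used so that no rounding enters.

```python
from fractions import Fraction


def robertson_gamma(omega: int) -> Fraction:
    """gamma = (omega + 1) / (omega + 2) for the Brans-Dicke theory."""
    return Fraction(omega + 1, omega + 2)


def geodetic_factor(gamma: Fraction) -> Fraction:
    """Factor (1 + 2 gamma) / 3 multiplying the geodetic precession."""
    return (1 + 2 * gamma) / 3


def rotation_factor(omega: int) -> Fraction:
    """Factor (2 omega + 3) / (2 omega + 4) multiplying the zeta field."""
    return Fraction(2 * omega + 3, 2 * omega + 4)


omega = 6
gamma = robertson_gamma(omega)
geodetic = geodetic_factor(gamma)
rotation = rotation_factor(omega)

print(gamma, geodetic, rotation)
```

For $\omega = 6$ this gives $\gamma = 7/8$, a geodetic precession reduced to $11/12$ of its Einstein value, and rotational effects reduced to $15/16$. All three factors tend to unity as $\omega \to \infty$.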

By far the most dramatic tests of the Brans-Dicke theory are those that also
test the “very strong” Principle of Equivalence. At any point $P$ in a gravitational
field we can choose a locally inertial coordinate system, for which $g_{\mu\nu} = \eta_{\mu\nu}$ and
$\Gamma^\lambda_{\mu\nu} = 0$ at that point. However, the Brans-Dicke field $\xi$ is a scalar, and hence will
not vanish at $P$, being given by Eq. (9.9.6) and (9.9.13) as

$$ \xi \simeq \overset{2}{\xi} = -(\omega + 2)^{-1}\phi $$

where $\phi$ is the Newtonian gravitational potential. Equation (9.9.4) shows that in
this coordinate system, the gravitational field of a small mass introduced at $P$
can be calculated as usual, but with the gravitational constant $\mathcal{G}$ replaced with

$$ \mathcal{G}_{\rm eff} = \mathcal{G}(1 + \xi)^{-1} \simeq \mathcal{G}\left[ 1 + (\omega + 2)^{-1}\phi \right] \tag{9.9.29} $$

For instance, with $\omega = 6$ and $\phi$ at the surface of the earth equal to $-6.9 \times 10^{-10}$,
the effective gravitational constant measured by Cavendish experiments on the
surface of the earth is smaller than the “true” gravitational coupling constant that
would be measured on a satellite in a high orbit by a factor $[1 - 8 \times 10^{-11}]$.
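The size of this effect follows directly from (9.9.29). The sketch below recomputes the earth's surface potential from assumed CGS constants (which are not given in the text) and then the fractional reduction $|\phi|/(\omega + 2)$ for $\omega = 6$. The names are illustrative.

```python
# Assumed CGS values (not given in the text).
G = 6.674e-8          # cm^3 g^-1 s^-2
C = 2.998e10          # cm / s
M_EARTH = 5.972e27    # g
R_EARTH = 6.371e8     # cm


def surface_potential() -> float:
    """Dimensionless Newtonian potential -GM/(R c^2) at the earth's surface."""
    return -G * M_EARTH / (R_EARTH * C ** 2)


def fractional_reduction(phi: float, omega: float) -> float:
    """Fractional decrease of the effective coupling, -phi / (omega + 2)."""
    return -phi / (omega + 2.0)


phi_surface = surface_potential()
reduction = fractional_reduction(phi_surface, omega=6.0)

print(f"phi at surface       ~ {phi_surface:.2e}")
print(f"fractional reduction ~ {reduction:.1e}")
```

With these inputs the potential comes out near $-7 \times 10^{-10}$ and the reduction is somewhat under $10^{-10}$, consistent in order of magnitude with the factor quoted above.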
“It is the nature of all
greatness not to be exact.”
Edmund Burke, speech on
American taxation, 1774


10 GRAVITATIONAL
RADIATION


We have seen a great many similarities between gravitation and electro- 
magnetism. It should therefore come as no surprise that Einstein’s equations, like 
Maxwell’s equations, have radiative solutions. 

No one has yet certainly detected gravitational radiation, but the reason for
this is not hard to find; Einstein's theory predicts that gravitational radiation is
produced in extremely small quantities in ordinary atomic processes. For instance,
the probability that a transition between two atomic states will proceed by emission
of gravitational, rather than electromagnetic, radiation is typically of order
$GE^2/e^2$, where $E$ is the energy released and $e$ is the electronic charge. For $E \sim 1$ eV,
this probability is about $3 \times 10^{-54}$.

Why then study gravitational radiation? One reason is of course that some 
day we may find a strong source of gravitational radiation. Such a source may 
indeed already have been detected. (See Section 10.7.) However, gravitational 
radiation would be interesting even if there were no chance of ever detecting 
any, for the theory of gravitational radiation provides a crucial link between general 
relativity and the microscopic frontier of physics. 

We have learned in recent years to describe the fundamental observables of 
microscopic phenomena in terms of elementary particles and their collisions. In 
classical electrodynamics it is the plane-wave solutions of Maxwell’s equations that 
lead most naturally to an interpretation in terms of a particle, the photon. Similarly, 
it is the radiative solutions of Einstein’s equations that will lead here to the concept 
of a particle of gravitational radiation, the graviton. 

Unfortunately, the theory of gravitational radiation is complicated by the
nonlinearity of Einstein's equations. In the spirit of Section 7.6, we may say that
any gravitational wave is itself a distribution of energy and momentum that 
contributes to the gravitational field of the wave. This complication prevents our 
being able to find general radiative solutions of the exact Einstein equations. 

There are two approaches to this difficulty. One is to study only the weak- 
field radiative solutions of Einstein equations, which describe waves carrying not 
enough energy and momentum to affect their own propagation. The other approach 
is to look long and hard for special solutions of the exact Einstein field equations. 
A great deal of mathematical ingenuity has gone into the second approach, with 
results of some elegance. However, this chapter deals with only the first, weak- 
field, approach to gravitational radiation. One reason is that any observable 
gravitational radiation is likely to be of very low intensity. A second, deeper, 
reason is that it is only possible to attach a precise meaning to the concept of an 
elementary particle when it is far away from all other particles, and for gravitons 
this corresponds to a weak-field solution of the field equations. 

The reader should not conclude that there is any fundamental gap in our 
understanding of gravitation because we cannot find general exact solutions of the 
nonlinear field equations. Indeed, similar problems arise in electrodynamics: 
The problem of computing the exact electromagnetic field produced by a decaying 
current in an electrical oscillator is highly nonlinear, because the field acts back on 
the current that produces it. Even though this problem was not solved for many 
years after Maxwell’s theory, still there was no doubt that electrical oscillators 
would produce the electromagnetic waves studied by Maxwell. Gravitational 
waves are more complicated than electromagnetic waves because they contribute 
to their own source outside the material gravitational antenna. However, the 
simple properties of both electromagnetic and gravitational waves emerge when 
we look far out into the wave zone, where the fields are weak. 


1 The Weak-Field Approximation 


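We suppose the metric to be close to the Minkowski metric $\eta_{\mu\nu}$:

$$g_{\mu\nu} = \eta_{\mu\nu} + h_{\mu\nu} \tag{10.1.1}$$

where $|h_{\mu\nu}| \ll 1$. To first order in $h$, the Ricci tensor is then

$$R_{\mu\nu} \simeq \frac{\partial}{\partial x^\nu}\Gamma^\lambda_{\lambda\mu} - \frac{\partial}{\partial x^\lambda}\Gamma^\lambda_{\mu\nu} + O(h^2) \tag{10.1.2}$$

and the affine connection is

$$\Gamma^\lambda_{\mu\nu} = \frac{1}{2}\eta^{\lambda\rho}\left[\frac{\partial}{\partial x^\mu}h_{\rho\nu} + \frac{\partial}{\partial x^\nu}h_{\rho\mu} - \frac{\partial}{\partial x^\rho}h_{\mu\nu}\right] + O(h^2) \tag{10.1.3}$$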
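As long as we restrict ourselves to first order in $h$, we must raise and lower all
indices using $\eta^{\mu\nu}$, not $g^{\mu\nu}$; that is,

$$\eta^{\lambda\rho}h_{\rho\nu} \equiv h^\lambda{}_\nu, \qquad \eta^{\lambda\rho}\frac{\partial}{\partial x^\rho} \equiv \frac{\partial}{\partial x_\lambda}, \qquad \text{etc.}$$

With this understanding, Eqs. (10.1.2) and (10.1.3) yield the first-order Ricci
tensor

$$R_{\mu\nu} \simeq R^{(1)}_{\mu\nu} \equiv \frac{1}{2}\left(\Box^2 h_{\mu\nu} - \frac{\partial^2}{\partial x^\lambda\,\partial x^\mu}h^\lambda{}_\nu - \frac{\partial^2}{\partial x^\lambda\,\partial x^\nu}h^\lambda{}_\mu + \frac{\partial^2}{\partial x^\mu\,\partial x^\nu}h^\lambda{}_\lambda\right)$$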



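The Einstein field equations therefore read

$$\Box^2 h_{\mu\nu} - \frac{\partial^2}{\partial x^\lambda\,\partial x^\mu}h^\lambda{}_\nu - \frac{\partial^2}{\partial x^\lambda\,\partial x^\nu}h^\lambda{}_\mu + \frac{\partial^2}{\partial x^\mu\,\partial x^\nu}h^\lambda{}_\lambda = -16\pi G S_{\mu\nu} \tag{10.1.4}$$

$$S_{\mu\nu} \equiv T_{\mu\nu} - \frac{1}{2}\eta_{\mu\nu}T^\lambda{}_\lambda \tag{10.1.5}$$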


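Here $T_{\mu\nu}$ is taken to lowest order in $h_{\mu\nu}$, so it is independent of $h_{\mu\nu}$, and satisfies
the ordinary conservation conditions

$$\frac{\partial}{\partial x^\mu}T^\mu{}_\nu = 0 \tag{10.1.6}$$

(If gravitational forces play an important role in the structure of the radiating
system, then $\tau_{\mu\nu}$ should be used in place of $T_{\mu\nu}$; see Section 7.6.) Note that it is this
form of the conservation law that is needed for the consistency of (10.1.4), because
(10.1.6) implies

$$\frac{\partial}{\partial x^\mu}S^\mu{}_\nu = \frac{1}{2}\frac{\partial}{\partial x^\nu}S^\lambda{}_\lambda$$

whereas the linearized Ricci tensor satisfies Bianchi identities of the form

$$\frac{\partial}{\partial x^\mu}R^{(1)\mu}{}_\nu = \frac{1}{2}\frac{\partial}{\partial x^\nu}R^{(1)\lambda}{}_\lambda = \frac{1}{2}\frac{\partial}{\partial x^\nu}\left(\Box^2 h^\lambda{}_\lambda - \frac{\partial^2}{\partial x^\lambda\,\partial x^\mu}h^{\lambda\mu}\right)$$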

As already discussed in Section 7.4, we cannot expect a field equation such as
(10.1.4) to yield unique solutions, because given any solution, we can always
generate other solutions by performing coordinate transformations. The most
general coordinate transformation that leaves the field weak is of the form

$$x^\mu \to x'^\mu = x^\mu + \epsilon^\mu(x) \tag{10.1.7}$$

where $\partial\epsilon^\mu/\partial x^\nu$ is at most of the same order of magnitude as $h_{\mu\nu}$. The metric in the
new coordinate system is given by

$$g'^{\mu\nu} = \frac{\partial x'^\mu}{\partial x^\lambda}\frac{\partial x'^\nu}{\partial x^\rho}g^{\lambda\rho}$$

or, since $g^{\mu\nu} \simeq \eta^{\mu\nu} - h^{\mu\nu}$,

$$h'^{\mu\nu} = h^{\mu\nu} - \frac{\partial\epsilon^\mu}{\partial x^\lambda}\eta^{\lambda\nu} - \frac{\partial\epsilon^\nu}{\partial x^\rho}\eta^{\rho\mu}$$

Thus, if $h_{\mu\nu}$ is a solution of (10.1.4), then so will be

$$h'_{\mu\nu} = h_{\mu\nu} - \frac{\partial\epsilon_\mu}{\partial x^\nu} - \frac{\partial\epsilon_\nu}{\partial x^\mu} \tag{10.1.8}$$

where $\epsilon_\mu \equiv \epsilon^\nu\eta_{\mu\nu}$ are four small but otherwise arbitrary functions of $x^\mu$. That this
is the case can be verified by direct inspection of Eq. (10.1.4); this property is
called the gauge invariance of the field equation.
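
The inspection can also be carried out numerically. The following is only an illustrative sketch, not part of the original derivation: it assumes a plane-wave perturbation $h_{\mu\nu} = e_{\mu\nu}\exp(ik_\lambda x^\lambda)$, so that every derivative on the left-hand side of (10.1.4) becomes a factor $ik_\mu$, and it checks that a pure-gauge polarization $k_\mu\epsilon_\nu + k_\nu\epsilon_\mu$ is annihilated for an arbitrary (not necessarily null) wave vector. The index ordering $(t, x, y, z)$, the function names, and the sample numbers are assumptions of the example.

```python
# Momentum-space form of the left-hand side of (10.1.4) for
# h_{mu nu} = e_{mu nu} exp(i k.x): every derivative d/dx^mu becomes i k_mu.
# Index order (0, 1, 2, 3) = (t, x, y, z); eta = diag(-1, +1, +1, +1).

ETA = [-1.0, 1.0, 1.0, 1.0]  # diagonal of the Minkowski metric


def linearized_operator(k_lo, e_lo):
    """Return -k^2 e_{mu nu} + k_lam k_mu e^lam_nu + k_lam k_nu e^lam_mu
    - k_mu k_nu e^lam_lam, with all free indices lowered."""
    k_up = [ETA[m] * k_lo[m] for m in range(4)]
    k_sq = sum(k_lo[m] * k_up[m] for m in range(4))
    # k_lam e^lam_nu is the same as k^lam e_{lam nu}
    ke = [sum(k_up[l] * e_lo[l][n] for l in range(4)) for n in range(4)]
    trace = sum(ETA[l] * e_lo[l][l] for l in range(4))
    return [[-k_sq * e_lo[m][n] + k_lo[m] * ke[n] + k_lo[n] * ke[m]
             - k_lo[m] * k_lo[n] * trace
             for n in range(4)] for m in range(4)]


def pure_gauge(k_lo, eps_lo):
    """Polarization k_mu eps_nu + k_nu eps_mu generated by (10.1.8)."""
    return [[k_lo[m] * eps_lo[n] + k_lo[n] * eps_lo[m] for n in range(4)]
            for m in range(4)]


k = [0.7, -1.3, 0.4, 2.1]            # arbitrary covariant wave vector
eps = [0.2 + 0.1j, -0.5j, 1.1, 0.3]  # arbitrary gauge parameters
residual = linearized_operator(k, pure_gauge(k, eps))
max_residual = max(abs(x) for row in residual for x in row)
print(max_residual)  # zero up to rounding error
```

Because the wave vector here is not required to satisfy $k_\mu k^\mu = 0$, the check does not rely on the field equation itself, only on the structure of its left-hand side.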

The gauge invariance of Eq. (10.1.4) is a nuisance when it comes to actually
solving the field equations. However, the difficulty can be removed by choosing
some particular gauge, that is, coordinate system. The most convenient choice is
to work in a harmonic coordinate system, for which

$$g^{\mu\nu}\Gamma^\lambda_{\mu\nu} = 0$$

Using (10.1.3), this gives to first order

$$\frac{\partial}{\partial x^\mu}h^\mu{}_\nu = \frac{1}{2}\frac{\partial}{\partial x^\nu}h^\mu{}_\mu \tag{10.1.9}$$

That this choice is always possible follows from the general argument of Section
7.4; it can also be seen from (10.1.8) that if $h_{\mu\nu}$ does not satisfy (10.1.9), then we
can find an $h'_{\mu\nu}$ that does, by performing the coordinate transformation (10.1.7)
with

$$\Box^2\epsilon_\nu = \frac{\partial}{\partial x^\mu}h^\mu{}_\nu - \frac{1}{2}\frac{\partial}{\partial x^\nu}h^\mu{}_\mu$$

It will be assumed from now on that $h_{\mu\nu}$ does satisfy Eq. (10.1.9).

Using (10.1.9) in (10.1.4), the field equation now reads

$$\Box^2 h_{\mu\nu} = -16\pi G S_{\mu\nu} \tag{10.1.10}$$

One solution is the retarded potential

$$h_{\mu\nu}(\mathbf{x}, t) = 4G\int d^3x'\,\frac{S_{\mu\nu}(\mathbf{x}', t - |\mathbf{x} - \mathbf{x}'|)}{|\mathbf{x} - \mathbf{x}'|} \tag{10.1.11}$$

We have already remarked that the conservation law (10.1.6) for $T_{\mu\nu}$ is equivalent
to

$$\frac{\partial}{\partial x^\mu}S^\mu{}_\nu = \frac{1}{2}\frac{\partial}{\partial x^\nu}S^\lambda{}_\lambda \tag{10.1.12}$$

and in consequence the solution (10.1.11) for a source $S_{\mu\nu}$ confined to a finite volume
automatically satisfies the harmonic coordinate conditions (10.1.9). (The proof is
identical to that used in electrodynamics in the calculation of the vector potential
in Lorentz gauge.) To (10.1.11) we can add any solution of the homogeneous
equations

$$\Box^2 h_{\mu\nu} = 0 \tag{10.1.13}$$

$$\frac{\partial}{\partial x^\mu}h^\mu{}_\nu = \frac{1}{2}\frac{\partial}{\partial x^\nu}h^\mu{}_\mu \tag{10.1.14}$$

We interpret (10.1.11) as the gravitational radiation produced by the source $S_{\mu\nu}$,
whereas any additional term satisfying (10.1.13), (10.1.14) represents the gravitational
radiation coming in from infinity. The occurrence in (10.1.11) of the time
argument $t - |\mathbf{x} - \mathbf{x}'|$ shows that gravitational effects propagate with unit
velocity, that is, with the speed of light.


2 Plane Waves 

We now consider the plane-wave solutions of the homogeneous equations
(10.1.13) and (10.1.14), both because they are important in their own right and,
as we shall see, because the retarded wave (10.1.11) approaches a plane wave as
$r \to \infty$. The general solution of (10.1.13) and (10.1.14) is a linear superposition of
solutions of the form

$$h_{\mu\nu}(x) = e_{\mu\nu}\exp(ik_\lambda x^\lambda) + e^*_{\mu\nu}\exp(-ik_\lambda x^\lambda) \tag{10.2.1}$$

This satisfies (10.1.13) if

$$k_\mu k^\mu = 0 \tag{10.2.2}$$

and satisfies (10.1.14) if

$$k_\mu e^\mu{}_\nu = \frac{1}{2}k_\nu e^\mu{}_\mu \tag{10.2.3}$$

(Of course we are still raising and lowering indices with $\eta$, so that $k^\mu \equiv \eta^{\mu\nu}k_\nu$
and $e^\mu{}_\nu \equiv \eta^{\mu\rho}e_{\rho\nu}$.) The matrix $e_{\mu\nu}$ is obviously symmetric:

$$e_{\mu\nu} = e_{\nu\mu} \tag{10.2.4}$$

It will be called the polarization tensor.

A symmetric $4 \times 4$ matrix would in general have ten independent components,
and the four relations (10.2.3) would lower this number to six, but of these six only
two represent physically significant degrees of freedom. By a change of coordinates
$x^\mu \to x^\mu + \epsilon^\mu(x)$ we can transform the metric $\eta_{\mu\nu} + h_{\mu\nu}$ into a new metric
$\eta_{\mu\nu} + h'_{\mu\nu}$, with $h'_{\mu\nu}$ given by (10.1.8). Suppose that we choose

$$\epsilon^\mu(x) = i\epsilon^\mu\exp(ik_\lambda x^\lambda) - i\epsilon^{\mu *}\exp(-ik_\lambda x^\lambda) \tag{10.2.5}$$

Then (10.1.8) gives

$$h'_{\mu\nu}(x) = e'_{\mu\nu}\exp(ik_\lambda x^\lambda) + e'^{*}_{\mu\nu}\exp(-ik_\lambda x^\lambda) \tag{10.2.6}$$

where

$$e'_{\mu\nu} = e_{\mu\nu} + k_\mu\epsilon_\nu + k_\nu\epsilon_\mu \tag{10.2.7}$$

[Note that the wave still satisfies the harmonic coordinate condition (10.2.3).]
We conclude that $e'_{\mu\nu}$ and $e_{\mu\nu}$ represent the same physical situation for arbitrary
values of the four parameters $\epsilon_\mu$, so of the six independent $e_{\mu\nu}$'s satisfying (10.2.3)
and (10.2.4), only $6 - 4 = 2$ are physically significant. For instance, consider
a wave traveling in the $+z$-direction, with wave vector

$$k^1 = k^2 = 0, \qquad k^3 = k^0 \equiv k > 0 \tag{10.2.8}$$

In this case (10.2.3) gives

$$e_{31} + e_{01} = e_{32} + e_{02} = 0$$

$$e_{33} + e_{03} = -e_{03} - e_{00} = \frac{1}{2}(e_{11} + e_{22} + e_{33} - e_{00})$$

These four equations allow us to express $e_{i0}$ and $e_{22}$ in terms of the other six $e_{\mu\nu}$:

$$e_{01} = -e_{31}, \qquad e_{02} = -e_{32}, \qquad e_{03} = -\frac{1}{2}(e_{33} + e_{00}), \qquad e_{22} = -e_{11} \tag{10.2.9}$$


When the coordinate system is subjected to the transformation defined by (10.1.7)
and (10.2.5), these six independent components of $e_{\mu\nu}$ change according to Eq.
(10.2.7):

$$e'_{11} = e_{11}, \qquad e'_{12} = e_{12}$$

$$e'_{13} = e_{13} + k\epsilon_1, \qquad e'_{23} = e_{23} + k\epsilon_2$$

$$e'_{33} = e_{33} + 2k\epsilon_3, \qquad e'_{00} = e_{00} - 2k\epsilon_0$$

Thus it is only $e_{11}$ and $e_{12}$ that have an absolute physical significance. Indeed,
we can arrange that all components of $e'_{\mu\nu}$ vanish except for $e'_{11}$, $e'_{12}$, and
$e'_{22} = -e'_{11}$, by performing a coordinate transformation with

$$\epsilon_1 = -\frac{e_{13}}{k}, \qquad \epsilon_2 = -\frac{e_{23}}{k}, \qquad \epsilon_3 = -\frac{e_{33}}{2k}, \qquad \epsilon_0 = \frac{e_{00}}{2k}$$
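
The elimination of the unphysical components can be checked with a few lines of arithmetic. The sketch below is illustrative only: it builds a polarization tensor from six arbitrary sample components using the relations (10.2.9), applies the transformation (10.2.7) with the gauge parameters just quoted, and confirms that only the 11, 12, and 22 components survive. The index ordering $(t, x, y, z)$, the helper name `harmonic_polarization`, and the numerical values are assumptions of the example.

```python
# Check of (10.2.7)-(10.2.9): gauge away everything except e11, e12, e22.
# Index order (0, 1, 2, 3) = (t, x, y, z); eta = diag(-1, +1, +1, +1).

k = 1.7                    # wave number, k^3 = k^0 = k as in (10.2.8)
k_lo = [-k, 0.0, 0.0, k]   # covariant components k_mu


def harmonic_polarization(e11, e12, e13, e23, e33, e00):
    """Build the symmetric e_{mu nu} from six free components via (10.2.9)."""
    entries = {(1, 1): e11, (1, 2): e12, (1, 3): e13, (2, 3): e23,
               (3, 3): e33, (0, 0): e00,
               # components fixed by the harmonic condition (10.2.9)
               (2, 2): -e11, (0, 1): -e13, (0, 2): -e23,
               (0, 3): -0.5 * (e33 + e00)}
    e = [[0j] * 4 for _ in range(4)]
    for (m, n), value in entries.items():
        e[m][n] = e[n][m] = value
    return e


e = harmonic_polarization(0.3 + 0.2j, -0.1j, 0.5, 0.7 - 0.4j, 1.1, -0.6j)

# Gauge parameters eps_mu in the order (eps_0, eps_1, eps_2, eps_3).
eps = [e[0][0] / (2 * k), -e[1][3] / k, -e[2][3] / k, -e[3][3] / (2 * k)]

# (10.2.7): e'_{mu nu} = e_{mu nu} + k_mu eps_nu + k_nu eps_mu
e_new = [[e[m][n] + k_lo[m] * eps[n] + k_lo[n] * eps[m] for n in range(4)]
         for m in range(4)]

transverse = {(1, 1), (1, 2), (2, 1), (2, 2)}
leftover = max(abs(e_new[m][n]) for m in range(4) for n in range(4)
               if (m, n) not in transverse)
print(leftover)                   # zero up to rounding error
print(e_new[1][1], e_new[1][2])   # e11 and e12 are unchanged
```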


The distinction between the different components of the polarization tensor
is clarified if we ask how $e_{\mu\nu}$ changes when we subject the coordinate system to a
rotation about the z-axis. This is just a Lorentz transformation of the form

$$\begin{aligned}
R_1{}^1 &= \cos\theta, & R_1{}^2 &= \sin\theta \\
R_2{}^1 &= -\sin\theta, & R_2{}^2 &= \cos\theta \\
R_3{}^3 &= R_0{}^0 = 1, & \text{other } R_\mu{}^\nu &= 0
\end{aligned} \tag{10.2.10}$$

and since it leaves $k_\mu$ invariant (i.e., $R_\mu{}^\nu k_\nu = k_\mu$), the only effect is to transform
$e_{\mu\nu}$ into

$$e'_{\mu\nu} = R_\mu{}^\rho R_\nu{}^\sigma e_{\rho\sigma} \tag{10.2.11}$$

Using the relations (10.2.9), we find that

$$e'_\pm = \exp(\pm 2i\theta)\,e_\pm \tag{10.2.12}$$

$$f'_\pm = \exp(\pm i\theta)\,f_\pm \tag{10.2.13}$$

$$e'_{33} = e_{33}, \qquad e'_{00} = e_{00} \tag{10.2.14}$$

where

$$e_\pm \equiv e_{11} \mp ie_{12} = -e_{22} \mp ie_{12} \tag{10.2.15}$$

$$f_\pm \equiv e_{31} \mp ie_{32} = -e_{01} \pm ie_{02} \tag{10.2.16}$$

In general, any plane wave $\psi$, which is transformed by a rotation of any angle $\theta$
about the direction of propagation into

$$\psi' = e^{ih\theta}\psi \tag{10.2.17}$$

is said to have helicity $h$. We thus have shown that a gravitational plane wave
can be decomposed into parts $e_\pm$ with helicity $\pm 2$, parts $f_\pm$ with helicity $\pm 1$,
and parts $e_{00}$ and $e_{33}$ with helicity zero. However, we have also seen that the parts
with helicity 0 and $\pm 1$ can be made to vanish by a suitable choice of coordinates,
so the physically significant components are just those with helicity $\pm 2$.
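
The helicity assignments can be confirmed by applying the rotation (10.2.10) to a sample polarization tensor. The code below is a sketch under stated assumptions: the index ordering $(t, x, y, z)$, the function names, and the sample components (chosen to obey (10.2.9)) are ours, and the combinations $e_\pm$ and $f_\pm$ are taken in the form given in (10.2.15) and (10.2.16).

```python
import cmath
import math


def rotation(theta):
    """Matrix R[mu][nu] = R_mu^nu of (10.2.10); index order (t, x, y, z)."""
    c, s = math.cos(theta), math.sin(theta)
    R = [[0.0] * 4 for _ in range(4)]
    R[0][0] = R[3][3] = 1.0
    R[1][1], R[1][2] = c, s
    R[2][1], R[2][2] = -s, c
    return R


def rotate(e, R):
    """(10.2.11): e'_{mu nu} = R_mu^rho R_nu^sigma e_{rho sigma}."""
    return [[sum(R[m][r] * R[n][s] * e[r][s]
                 for r in range(4) for s in range(4))
             for n in range(4)] for m in range(4)]


def helicity_parts(e):
    """Return (e_+, e_-, f_+, f_-) as in (10.2.15) and (10.2.16)."""
    return (e[1][1] - 1j * e[1][2], e[1][1] + 1j * e[1][2],
            e[3][1] - 1j * e[3][2], e[3][1] + 1j * e[3][2])


# Sample polarization obeying (10.2.9); the numbers are arbitrary.
e11, e12, e13, e23, e33, e00 = 0.3 + 0.2j, -0.1j, 0.5, 0.7 - 0.4j, 1.1, -0.6j
e03 = -0.5 * (e33 + e00)
e = [[e00, -e13, -e23, e03],
     [-e13, e11, e12, e13],
     [-e23, e12, -e11, e23],
     [e03, e13, e23, e33]]

theta = 0.37
before = helicity_parts(e)
after = helicity_parts(rotate(e, rotation(theta)))
expected_phase = [cmath.exp(2j * theta), cmath.exp(-2j * theta),
                  cmath.exp(1j * theta), cmath.exp(-1j * theta)]
errors = [abs(a - p * b) for a, p, b in zip(after, expected_phase, before)]
print(max(errors))  # zero up to rounding error
```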

Once again we find a fruitful analogy with electromagnetism. The Maxwell
equations in Lorentz gauge are (2.7.12) and (2.7.13); in empty space they become

$$\Box^2 A_\alpha = 0, \qquad \frac{\partial A^\alpha}{\partial x^\alpha} = 0$$

in analogy with Eqs. (10.1.13) and (10.1.14) for the metric in harmonic coordinates.
(We are now in an inertial coordinate system, and so $\Box^2 \equiv \eta^{\alpha\beta}\,\partial^2/\partial x^\alpha\,\partial x^\beta$.) We
can find a plane-wave solution of the form

$$A_\alpha = e_\alpha\exp(ik_\beta x^\beta) + e^*_\alpha\exp(-ik_\beta x^\beta)$$

where

$$k_\alpha k^\alpha = 0, \qquad k_\alpha e^\alpha = 0$$

in analogy with Eqs. (10.2.1)-(10.2.3).

In general $e_\alpha$ would have four independent components, but the condition
that $k_\alpha e^\alpha$ vanish reduces the number of independent components to three, just as
(10.2.3) reduces the number of independent components of $e_{\mu\nu}$ from ten to six.
Furthermore, without changing the physical fields E and B and without leaving the
Lorentz gauge, we can change $A_\alpha$ by a gauge transformation

$$A'_\alpha = A_\alpha + \frac{\partial\Phi}{\partial x^\alpha}$$

$$\Phi(x) = i\epsilon\exp(ik_\beta x^\beta) - i\epsilon^*\exp(-ik_\beta x^\beta)$$

in analogy with (10.1.8) and (10.2.5). The new potential can be written

$$A'_\alpha = e'_\alpha\exp(ik_\beta x^\beta) + e'^{*}_\alpha\exp(-ik_\beta x^\beta)$$

$$e'_\alpha = e_\alpha - \epsilon k_\alpha$$

in analogy with (10.2.6) and (10.2.7). The parameter $\epsilon$ is arbitrary, so of the three
algebraically independent components of $e_\alpha$ only $3 - 1 = 2$ are physically
significant, just as general covariance renders only two of the six independent
components of $e_{\mu\nu}$ physically significant. To identify the two significant components
of $e_\alpha$, we may consider a wave traveling in the z-direction, with $k^\alpha$ given by Eq.
(10.2.8). Then the condition that $k_\alpha e^\alpha$ vanish allows us to determine $e_0$,

$$e_0 = -e_3$$

just as (10.2.3) allows us to determine $e_{22}$ and $e_{0i}$ in terms of the other six $e_{\mu\nu}$.
Also, the preceding gauge transformation leaves $e_1$ and $e_2$ invariant but changes
$e_3$ into

$$e'_3 = e_3 - \epsilon k$$

Hence $e'_3$ can be made equal to zero by choosing $\epsilon = e_3/k$, so it is only $e_1$ and $e_2$
that carry physical significance, just as it was only $e_{11}$ and $e_{12}$ that could not be
made equal to zero by a suitable coordinate transformation. Finally, we can work
out the meaning of these two components by subjecting the plane electromagnetic
wave to the rotation (10.2.10). The polarization vector is then changed into

$$e'_\alpha = R_\alpha{}^\beta e_\beta$$

and therefore

$$e'_\pm = \exp(\pm i\theta)\,e_\pm, \qquad e'_3 = e_3$$

where

$$e_\pm \equiv e_1 \mp ie_2$$

Thus the electromagnetic wave can be decomposed into parts with helicity $\pm 1$
and 0. However, the physically significant helicities are $\pm 1$, not 0, just as for
gravitational waves they are $\pm 2$, not $\pm 1$ or 0. This is what we mean when we say,
speaking classically, that electromagnetism and gravitation are carried by waves
of spin 1 and spin 2, respectively.


3 Energy and Momentum of Plane Waves 

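The physical significance of the plane-wave solution (10.2.1) is brought forward
by calculating the energy and momentum it carries. According to Eq. (7.6.4),
the energy-momentum tensor of gravitation is given to order $h^2$ by

$$t_{\mu\nu} \simeq \frac{1}{8\pi G}\left[-\frac{1}{2}h_{\mu\nu}R^{(1)\lambda}{}_\lambda + \frac{1}{2}\eta_{\mu\nu}h^{\rho\sigma}R^{(1)}_{\rho\sigma} + R^{(2)}_{\mu\nu} - \frac{1}{2}\eta_{\mu\nu}\eta^{\rho\sigma}R^{(2)}_{\rho\sigma}\right]$$

where $R^{(N)}_{\mu\nu}$ is the term in the Ricci tensor of order $N$ in $h$. The metric $g_{\mu\nu} =
\eta_{\mu\nu} + h_{\mu\nu}$ satisfies the first-order Einstein equations $R^{(1)}_{\mu\nu} = 0$, so we can drop
these terms in $t_{\mu\nu}$ and use

$$t_{\mu\nu} \simeq \frac{1}{8\pi G}\left[R^{(2)}_{\mu\nu} - \frac{1}{2}\eta_{\mu\nu}\eta^{\rho\sigma}R^{(2)}_{\rho\sigma}\right] \tag{10.3.1}$$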

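(For the actual metric it is $R_{\mu\nu}$ rather than $R^{(1)}_{\mu\nu}$ that vanishes, and $t_{\mu\nu}$ arises only
from the first-order terms in Eq. (7.6.4). Here it is $R^{(1)}_{\mu\nu}$ rather than $R_{\mu\nu}$ that vanishes,
because $g_{\mu\nu} = \eta_{\mu\nu} + h_{\mu\nu}$ satisfies the first-order Einstein equations rather than the
exact equations. The difference is only of order $h^2$.) To calculate $R^{(2)}_{\mu\nu}$ we must use
Eq. (10.2.1) in Eq. (7.6.15); the result is extremely complicated, but can be
simplified if we average $t_{\mu\nu}$ over a region of space and time much larger than
$|\mathbf{k}|^{-1}$. (This is the way the energy and momentum of any wave are usually
evaluated.) The averaging kills all terms proportional to $\exp(\pm 2ik_\lambda x^\lambda)$, and we
are left with only the $x$-independent cross-terms:

$$\begin{aligned}
\langle R^{(2)}_{\mu\nu}\rangle = \mathrm{Re}\Big\{ & e^{\lambda\rho *}\left[k_\mu k_\nu e_{\lambda\rho} - k_\mu k_\lambda e_{\nu\rho} - k_\nu k_\rho e_{\mu\lambda} + k_\lambda k_\rho e_{\mu\nu}\right] \\
& + \left[e^\lambda{}_\rho k_\lambda - \tfrac{1}{2}e^\lambda{}_\lambda k_\rho\right]^*\left[k_\mu e^\rho{}_\nu + k_\nu e^\rho{}_\mu - k^\rho e_{\mu\nu}\right] \\
& - \tfrac{1}{2}\left[k_\lambda e_{\rho\nu} + k_\nu e_{\rho\lambda} - k_\rho e_{\lambda\nu}\right]^*\left[k^\lambda e^\rho{}_\mu + k_\mu e^{\rho\lambda} - k^\rho e^\lambda{}_\mu\right]\Big\}
\end{aligned} \tag{10.3.2}$$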

(We have not yet made use of the conditions (10.2.2) and (10.2.3) appropriate
to harmonic coordinates, so suppose for a moment that we leave the harmonic
coordinate system by adding to $h_{\mu\nu}(x)$ a term

$$i(q_\mu\epsilon_\nu + q_\nu\epsilon_\mu)\exp(iq_\lambda x^\lambda) - i(q_\mu\epsilon^*_\nu + q_\nu\epsilon^*_\mu)\exp(-iq_\lambda x^\lambda) \tag{10.3.3}$$

where $q_\mu q^\mu \neq 0$. After averaging over space-time distances large compared with
$|q - k|^{-1}$, the interference between (10.2.1) and (10.3.3) drops out, and we find
for $\langle R^{(2)}_{\mu\nu}\rangle$ the term (10.3.2), plus another term obtained by replacing $k$ with $q$ and
$e_{\mu\nu}$ with $q_\mu\epsilon_\nu + q_\nu\epsilon_\mu$. Inspection of (10.3.2) shows immediately that this second
term vanishes, so $\langle R^{(2)}_{\mu\nu}\rangle$ and hence $\langle t_{\mu\nu}\rangle$ may be calculated in harmonic coordinates
with no loss of generality.)

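If we now use in (10.3.2) the harmonic coordinate conditions (10.2.2) and
(10.2.3), we find

$$\langle R^{(2)}_{\mu\nu}\rangle = \frac{k_\mu k_\nu}{2}\left(e^{\lambda\rho *}e_{\lambda\rho} - \frac{1}{2}|e^\lambda{}_\lambda|^2\right) \tag{10.3.4}$$

The quantity $\eta^{\mu\nu}\langle R^{(2)}_{\mu\nu}\rangle$ vanishes because $k_\mu k^\mu = 0$, so (10.3.1) now gives the
average energy-momentum tensor of a plane wave as

$$\langle t_{\mu\nu}\rangle = \frac{k_\mu k_\nu}{16\pi G}\left(e^{\lambda\rho *}e_{\lambda\rho} - \frac{1}{2}|e^\lambda{}_\lambda|^2\right) \tag{10.3.5}$$

Note that a “gauge transformation” (10.2.7) will change the terms in $\langle t_{\mu\nu}\rangle$ into

$$e'^{\lambda\rho *}e'_{\lambda\rho} = e^{\lambda\rho *}e_{\lambda\rho} + 2\,\mathrm{Re}\left(\epsilon^\rho k_\rho\,e^{\lambda *}{}_\lambda\right) + 2|\epsilon^\rho k_\rho|^2$$

$$e'^\lambda{}_\lambda = e^\lambda{}_\lambda + 2k^\lambda\epsilon_\lambda$$

but $\langle t_{\mu\nu}\rangle$ is gauge-invariant! Thus, as far as energy and momentum are concerned,
the polarizations $e_{\mu\nu}$ and $e_{\mu\nu} + k_\mu\epsilon_\nu + k_\nu\epsilon_\mu$ represent the same physical wave, and
we see again that there are not six but only two physically significant polarization
parameters. In particular, a wave traveling in the z-direction, with wave vector
and polarization tensor given by (10.2.8) and (10.2.9), has the energy-momentum
tensor

$$\langle t_{\mu\nu}\rangle = \frac{k_\mu k_\nu}{8\pi G}\left(|e_{11}|^2 + |e_{12}|^2\right) \tag{10.3.6}$$

or, in terms of the helicity amplitudes (10.2.15),

$$\langle t_{\mu\nu}\rangle = \frac{k_\mu k_\nu}{16\pi G}\left(|e_+|^2 + |e_-|^2\right) \tag{10.3.7}$$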
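
The gauge invariance of the averaged energy-momentum tensor is easy to test numerically. The following sketch is an illustration under stated assumptions, not a derivation: it takes the combination $e^{\lambda\rho *}e_{\lambda\rho} - \frac{1}{2}|e^\lambda{}_\lambda|^2$ that multiplies $k_\mu k_\nu$ in the averaged tensor, evaluates it for a purely transverse polarization and for the same polarization after an arbitrary transformation of the form $e_{\mu\nu} \to e_{\mu\nu} + k_\mu\epsilon_\nu + k_\nu\epsilon_\mu$, and compares both with $2(|e_{11}|^2 + |e_{12}|^2)$. The index ordering $(t, x, y, z)$ and all numbers are sample choices.

```python
# Gauge invariance of Q = e^{lam rho *} e_{lam rho} - (1/2) |e^lam_lam|^2.
# Index order (0, 1, 2, 3) = (t, x, y, z); eta = diag(-1, +1, +1, +1).

ETA = [-1.0, 1.0, 1.0, 1.0]


def energy_combination(e):
    """Evaluate Q for a polarization tensor with lowered indices."""
    full = sum(ETA[l] * ETA[r] * abs(e[l][r]) ** 2
               for l in range(4) for r in range(4))
    trace = sum(ETA[l] * e[l][l] for l in range(4))
    return full - 0.5 * abs(trace) ** 2


k = 2.0
k_lo = [-k, 0.0, 0.0, k]  # null wave vector along +z

# Purely transverse polarization: only e11, e12 and e22 = -e11 are non-zero.
e11, e12 = 0.3 + 0.2j, -0.4 + 0.1j
e_tt = [[0j] * 4 for _ in range(4)]
e_tt[1][1], e_tt[2][2] = e11, -e11
e_tt[1][2] = e_tt[2][1] = e12

# Arbitrary gauge parameters eps_mu.
eps = [0.9 - 0.3j, 0.2j, -1.4, 0.5 + 0.5j]
e_gauged = [[e_tt[m][n] + k_lo[m] * eps[n] + k_lo[n] * eps[m]
             for n in range(4)] for m in range(4)]

q_tt = energy_combination(e_tt)
q_gauged = energy_combination(e_gauged)
q_expected = 2.0 * (abs(e11) ** 2 + abs(e12) ** 2)
print(q_tt, q_gauged, q_expected)  # three equal numbers, up to rounding
```

The individual pieces of the combination do change under the transformation; only their difference is invariant, which is why the check compares the full expression.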


4 Generation of Gravitational Waves 


We wish to calculate the energy emitted in the form of gravitational radiation
by a system whose energy-momentum tensor can be expressed as a Fourier integral,

$$T_{\mu\nu}(\mathbf{x}, t) = \int_0^\infty d\omega\,T_{\mu\nu}(\mathbf{x}, \omega)e^{-i\omega t} + \text{c.c.} \tag{10.4.1}$$

or as a sum of Fourier components,

$$T_{\mu\nu}(\mathbf{x}, t) = \sum_\omega e^{-i\omega t}T_{\mu\nu}(\mathbf{x}, \omega) + \text{c.c.} \tag{10.4.2}$$

(Here “+ c.c.” means “plus the complex conjugate of the preceding term.”) We
first do the calculation for a single Fourier component,

$$T_{\mu\nu}(\mathbf{x}, t) = T_{\mu\nu}(\mathbf{x}, \omega)e^{-i\omega t} + \text{c.c.} \tag{10.4.3}$$
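and will then return to the more general systems described by (10.4.1) and (10.4.2).
From (10.1.11) we find that the $h_{\mu\nu}$-field emitted by the source (10.4.3) is

$$h_{\mu\nu}(\mathbf{x}, t) = 4G\int\frac{d^3x'}{|\mathbf{x} - \mathbf{x}'|}\,S_{\mu\nu}(\mathbf{x}', \omega)\exp\{-i\omega t + i\omega|\mathbf{x} - \mathbf{x}'|\} + \text{c.c.} \tag{10.4.4}$$

where

$$S_{\mu\nu}(\mathbf{x}, \omega) \equiv T_{\mu\nu}(\mathbf{x}, \omega) - \frac{1}{2}\eta_{\mu\nu}T^\lambda{}_\lambda(\mathbf{x}, \omega) \tag{10.4.5}$$

Suppose that we observe this radiation in the wave zone, that is, at distances
$r \equiv |\mathbf{x}|$ much larger than the dimension $R \equiv |\mathbf{x}'|_{\max}$ of the source, and also much
larger than $\omega R^2$ and $1/\omega$. Then the denominator $|\mathbf{x} - \mathbf{x}'|$ can be replaced with $r$,
while in the exponent we may approximate

$$|\mathbf{x} - \mathbf{x}'| \simeq r - \mathbf{x}'\cdot\hat{\mathbf{x}}, \qquad \hat{\mathbf{x}} \equiv \frac{\mathbf{x}}{r}$$

and the field becomes

$$h_{\mu\nu}(\mathbf{x}, t) = \frac{4G}{r}\exp(i\omega r - i\omega t)\int d^3x'\,S_{\mu\nu}(\mathbf{x}', \omega)e^{-i\omega\hat{\mathbf{x}}\cdot\mathbf{x}'} + \text{c.c.} \tag{10.4.6}$$

Since $r\omega$ is assumed large, this looks just like a plane wave,

$$h_{\mu\nu}(\mathbf{x}, t) = e_{\mu\nu}(\mathbf{x}, \omega)\exp(ik_\lambda x^\lambda) + \text{c.c.} \tag{10.4.7}$$

with “wave vector” and “polarization tensor” given by

$$\mathbf{k} \equiv \omega\hat{\mathbf{x}}, \qquad k^0 \equiv \omega \tag{10.4.8}$$

$$e_{\mu\nu}(\mathbf{x}, \omega) = \frac{4G}{r}\int d^3x'\,S_{\mu\nu}(\mathbf{x}', \omega)e^{-i\mathbf{k}\cdot\mathbf{x}'} \tag{10.4.9}$$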

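It will be convenient to write $e_{\mu\nu}$ explicitly in terms of the Fourier transform of
$T_{\mu\nu}$:

$$e_{\mu\nu}(\mathbf{x}, \omega) = \frac{4G}{r}\left[T_{\mu\nu}(\mathbf{k}, \omega) - \frac{1}{2}\eta_{\mu\nu}T^\lambda{}_\lambda(\mathbf{k}, \omega)\right] \tag{10.4.10}$$

$$T_{\mu\nu}(\mathbf{k}, \omega) \equiv \int d^3x'\,T_{\mu\nu}(\mathbf{x}', \omega)e^{-i\mathbf{k}\cdot\mathbf{x}'} \tag{10.4.11}$$

The conservation equation for $T^\mu{}_\nu(\mathbf{x}, t)$ is

$$\frac{\partial}{\partial x^\mu}T^\mu{}_\nu(\mathbf{x}, t) = 0$$

Applying this to (10.4.3) gives

$$\frac{\partial}{\partial x^i}T^i{}_\nu(\mathbf{x}, \omega) - i\omega T^0{}_\nu(\mathbf{x}, \omega) = 0$$

Multiplying with $e^{-i\mathbf{k}\cdot\mathbf{x}}$ and integrating over $\mathbf{x}$, we find that $T^\mu{}_\nu(\mathbf{k}, \omega)$ is subject to
the algebraic relations

$$k_\mu T^\mu{}_\nu(\mathbf{k}, \omega) = 0 \tag{10.4.12}$$

where $k^\mu$ is the vector given by (10.4.8). This, incidentally, verifies that (10.4.10)
obeys the harmonic coordinate condition (10.2.3).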

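Now let us calculate the power per unit solid angle emitted in a direction $\hat{\mathbf{x}}$.
Since $r \gg 1/\omega$, we can use for the energy flux vector the value $\langle t^{i0}\rangle$ obtained by
averaging over space-time dimensions large compared with $1/\omega$, so that the power
per solid angle is

$$\frac{dP}{d\Omega} = r^2\hat{x}^i\langle t^{i0}\rangle$$

We use for $\langle t_{\mu\nu}\rangle$ the value (10.3.5), so this gives

$$\frac{dP}{d\Omega} = \frac{r^2(\mathbf{k}\cdot\hat{\mathbf{x}})k^0}{16\pi G}\left[e^{\lambda\nu *}(\mathbf{x}, \omega)e_{\lambda\nu}(\mathbf{x}, \omega) - \frac{1}{2}|e^\lambda{}_\lambda(\mathbf{x}, \omega)|^2\right]$$

and inserting the values (10.4.8) and (10.4.10) for $k^\mu$ and $e_{\lambda\nu}$, we find that the $r^2$
factors cancel, and

$$\frac{dP}{d\Omega} = \frac{G\omega^2}{\pi}\left[T^{\lambda\nu *}(\mathbf{k}, \omega)T_{\lambda\nu}(\mathbf{k}, \omega) - \frac{1}{2}|T^\lambda{}_\lambda(\mathbf{k}, \omega)|^2\right] \tag{10.4.13}$$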

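The problem is thus solved once we have calculated the Fourier transform (10.4.11).

It will be convenient to express this result in terms of the purely spacelike
components of $T_{\lambda\nu}(\mathbf{k}, \omega)$. From (10.4.12) we have

$$T_{0i}(\mathbf{k}, \omega) = -\hat{k}^jT_{ji}(\mathbf{k}, \omega)$$

$$T_{00}(\mathbf{k}, \omega) = \hat{k}^i\hat{k}^jT_{ji}(\mathbf{k}, \omega)$$

where $\hat{\mathbf{k}} \equiv \mathbf{k}/\omega = \hat{\mathbf{x}}$. Using this in (10.4.13) gives

$$\frac{dP}{d\Omega} = \frac{G\omega^2}{\pi}\Lambda_{ij,lm}(\hat{k})\,T^{ij *}(\mathbf{k}, \omega)\,T^{lm}(\mathbf{k}, \omega) \tag{10.4.14}$$

where

$$\Lambda_{ij,lm}(\hat{k}) \equiv \delta_{il}\delta_{jm} - 2\hat{k}_j\hat{k}_m\delta_{il} + \frac{1}{2}\hat{k}_i\hat{k}_j\hat{k}_l\hat{k}_m - \frac{1}{2}\delta_{ij}\delta_{lm} + \frac{1}{2}\delta_{ij}\hat{k}_l\hat{k}_m + \frac{1}{2}\delta_{lm}\hat{k}_i\hat{k}_j \tag{10.4.15}$$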
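
The algebra leading from the covariant expression for the power to its purely spatial form can be spot-checked numerically. In the sketch below, which is an illustration only, a random-looking symmetric complex matrix stands in for the spatial components $T_{ij}(\mathbf{k}, \omega)$, the time components are eliminated with the conservation relations quoted above, and the two sides are compared. The function `lam` implements $\Lambda_{ij,lm}$ in the form written in (10.4.15); the sample direction and matrix entries are assumptions of the example.

```python
import math


def delta(a, b):
    """Kronecker delta."""
    return 1.0 if a == b else 0.0


def lam(i, j, l, m, n):
    """Lambda_{ij,lm}(n) for a unit vector n, following (10.4.15)."""
    return (delta(i, l) * delta(j, m)
            - 2.0 * n[j] * n[m] * delta(i, l)
            + 0.5 * n[i] * n[j] * n[l] * n[m]
            - 0.5 * delta(i, j) * delta(l, m)
            + 0.5 * delta(i, j) * n[l] * n[m]
            + 0.5 * delta(l, m) * n[i] * n[j])


# Arbitrary direction and arbitrary symmetric complex T_ij (sample numbers).
polar, azimuth = 0.8, 2.1
n = [math.sin(polar) * math.cos(azimuth),
     math.sin(polar) * math.sin(azimuth),
     math.cos(polar)]
T = [[1.0 + 0.5j, 0.2 - 0.3j, -0.7j],
     [0.2 - 0.3j, -0.4 + 1.0j, 0.6 + 0j],
     [-0.7j, 0.6 + 0j, 0.9 - 0.2j]]

idx = range(3)

# Left side: Lambda_{ij,lm} T_ij^* T_lm.
lhs = sum(lam(i, j, l, m, n) * T[i][j].conjugate() * T[l][m]
          for i in idx for j in idx for l in idx for m in idx)

# Right side: T^{lam nu *} T_{lam nu} - |T^lam_lam|^2 / 2, with
# T_{0i} = -n_j T_{ji} and T_{00} = n_i n_j T_{ij}.
T0 = [-sum(n[j] * T[j][i] for j in idx) for i in idx]
T00 = sum(n[i] * n[j] * T[i][j] for i in idx for j in idx)
full = (sum(abs(T[i][j]) ** 2 for i in idx for j in idx)
        - 2.0 * sum(abs(t) ** 2 for t in T0) + abs(T00) ** 2)
trace = sum(T[i][i] for i in idx) - T00
rhs = full - 0.5 * abs(trace) ** 2

print(abs(lhs - rhs))  # zero up to rounding error
```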

If the energy-momentum tensor is a sum of individual Fourier components as
in (10.4.2), then the field $h_{\mu\nu}$ in the wave zone will look like a sum of the plane
waves (10.4.7). The gravitational energy-momentum tensor will then be given by a
double sum over these Fourier components, but all cross-terms drop out when we
average over a time interval long compared with the longest “beat period,”
that is, the reciprocal of the shortest frequency difference. The power is thus given
by a sum of terms like (10.4.14), one for each frequency in the source.

Suppose, on the other hand, that the energy-momentum tensor is a Fourier
integral, as in (10.4.1). Then $h_{\mu\nu}$ in the wave zone will look like an integral over $\omega$
of the individual plane waves (10.4.7), and the gravitational energy-momentum
tensor will be given by a double integral $\iint d\omega\,d\omega'$ of products of these terms. The
integrand again has time dependence $\exp(-i(\omega - \omega')t)$, but now there is no
“longest beat period,” so instead of computing the average power we calculate
the total energy emitted. This is given by integrating the power over all time, and
the effect is to replace the factors $e^{-i\omega t}e^{i\omega' t}$ in the double integral for the power with

$$\int_{-\infty}^{\infty}\exp(-i(\omega - \omega')t)\,dt = 2\pi\,\delta(\omega - \omega')$$

The energy per solid angle emitted in a direction $\hat{\mathbf{k}}$ is thus a single integral:

$$\frac{dE}{d\Omega} = 2G\int_0^\infty\omega^2\left[T^{\lambda\nu *}(\mathbf{k}, \omega)T_{\lambda\nu}(\mathbf{k}, \omega) - \frac{1}{2}|T^\lambda{}_\lambda(\mathbf{k}, \omega)|^2\right]d\omega \tag{10.4.16}$$

or, in terms of the space-space components,

$$\frac{dE}{d\Omega} = 2G\,\Lambda_{ij,lm}(\hat{k})\int_0^\infty\omega^2\,T^{ij *}(\mathbf{k}, \omega)\,T^{lm}(\mathbf{k}, \omega)\,d\omega$$

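As an example, consider a system of free particles $n$ that initially move at
constant velocity $\mathbf{v}_n$, collide at the origin at $t = 0$, and then move off again at
velocities $\tilde{\mathbf{v}}_n$. The energy-momentum tensor is then

$$T^{\mu\nu}(\mathbf{x}, t) = \sum_n\frac{P_n^\mu P_n^\nu}{E_n}\,\delta^3(\mathbf{x} - \mathbf{v}_nt)\,\theta(-t) + \sum_n\frac{\tilde{P}_n^\mu\tilde{P}_n^\nu}{\tilde{E}_n}\,\delta^3(\mathbf{x} - \tilde{\mathbf{v}}_nt)\,\theta(t) \tag{10.4.17}$$

where $P_n^0 \equiv E_n = m_n(1 - \mathbf{v}_n^2)^{-1/2}$ and $\mathbf{P}_n = E_n\mathbf{v}_n$ are the energy and momentum
of the $n$th incoming particle, $\tilde{P}_n^0 \equiv \tilde{E}_n$ and $\tilde{\mathbf{P}}_n$ are the corresponding quantities
for the outgoing particles, and $\theta$ is the step function

$$\theta(s) = \begin{cases} +1 & s > 0 \\ 0 & s < 0 \end{cases} \tag{10.4.18}$$

The functions $\theta$ and $\delta^3$ have the well-known integral representations

$$\theta(s) = \frac{1}{2\pi i}\int_{-\infty}^{\infty}\frac{e^{+i\omega s}}{\omega - i\epsilon}\,d\omega, \qquad \epsilon \to 0+ \tag{10.4.19}$$

$$\delta^3(\mathbf{x}) = \frac{1}{(2\pi)^3}\int d^3k\,e^{i\mathbf{k}\cdot\mathbf{x}} \tag{10.4.20}$$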
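(To prove (10.4.19), note that the contour can be closed with a large semicircle
in the lower or upper half-plane, according to whether $s < 0$ or $s > 0$. To prove
(10.4.20), take the Fourier transform of both sides.) We see then that $T^{\mu\nu}(\mathbf{x}, t)$
is of the form (10.4.1), with

$$T^{\mu\nu}(\mathbf{x}, \omega) = \frac{1}{(2\pi)^4 i}\left[\sum_n\frac{P_n^\mu P_n^\nu}{E_n}\int d^3k\,\frac{e^{i\mathbf{k}\cdot\mathbf{x}}}{\omega - \mathbf{v}_n\cdot\mathbf{k} - i\epsilon} - \sum_n\frac{\tilde{P}_n^\mu\tilde{P}_n^\nu}{\tilde{E}_n}\int d^3k\,\frac{e^{i\mathbf{k}\cdot\mathbf{x}}}{\omega - \tilde{\mathbf{v}}_n\cdot\mathbf{k} + i\epsilon}\right]$$

and the Fourier transform (10.4.11) is

$$T^{\mu\nu}(\mathbf{k}, \omega) = \frac{1}{2\pi i}\left[\sum_n\frac{P_n^\mu P_n^\nu}{E_n(\omega - \mathbf{v}_n\cdot\mathbf{k} - i\epsilon)} - \sum_n\frac{\tilde{P}_n^\mu\tilde{P}_n^\nu}{\tilde{E}_n(\omega - \tilde{\mathbf{v}}_n\cdot\mathbf{k} + i\epsilon)}\right]$$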

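We can now drop the $\pm i\epsilon$ in the denominators, for $\omega - \mathbf{v}_n\cdot\mathbf{k}$ cannot vanish if
$\omega = |\mathbf{k}|$ and $|\mathbf{v}_n| < 1$. (For the case of particles traveling at the speed of light,
see below.) Also, $E_n(\mathbf{v}_n\cdot\mathbf{k} - \omega) = P_{n\lambda}k^\lambda \equiv (P_n\cdot k)$, so we can write

$$T^{\mu\nu}(\mathbf{k}, \omega) = \frac{1}{2\pi i}\sum_N\frac{P_N^\mu P_N^\nu\,\eta_N}{(P_N\cdot k)} \tag{10.4.21}$$

where $N$ runs over particles in both the initial and the final states, the sign factor
$\eta_N$ being

$$\eta_N \equiv \begin{cases} +1 & N \text{ in final state} \\ -1 & N \text{ in initial state} \end{cases}$$

We note that (10.4.12) is satisfied, because

$$k_\mu T^{\mu\nu}(\mathbf{k}, \omega) = \frac{1}{2\pi i}\sum_N P_N^\nu\,\eta_N$$

and this must vanish because $\sum_N P_N^\nu\eta_N$ is simply the change in the total $P^\nu$,
which is conserved.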

The gravitational energy per solid angle and per unit frequency interval
emitted at frequency $\omega$ and direction $\hat{\mathbf{k}}$ is now given by (10.4.16) as

$$\frac{dE}{d\Omega\,d\omega} = \frac{G\omega^2}{2\pi^2}\sum_{N,M}\frac{\eta_N\eta_M}{(P_N\cdot k)(P_M\cdot k)}\left[(P_N\cdot P_M)^2 - \frac{1}{2}m_N^2m_M^2\right] \tag{10.4.22}$$

If we tried to compute the total emitted energy by integrating $\omega$ from 0 to $\infty$,
we would get a result that diverges like $\int^\infty d\omega$. This is just due to our approximation
that the collision occurs instantaneously; actually it must take up some finite
time $\Delta t$, and the $\omega$-integral will be cut off at $\omega$ of order $1/\Delta t$.

Note that if none of the momenta $P_N^\mu$ are changed by the collision, then the
contributions of the incoming and outgoing particles in (10.4.21) will cancel, and
so $T^{\mu\nu}(\mathbf{k}, \omega)$ will vanish. Gravitational radiation is only emitted when the particles
actually undergo accelerations.

Note also that (10.4.22) seems to become infinite if one of the particles participating
in the reaction (say, $N = 1$) has zero mass, and has momentum approaching
a direction parallel to $\mathbf{k}$, since then $P_1\cdot k = E_1\omega(\hat{\mathbf{P}}_1\cdot\hat{\mathbf{k}} - 1) \to 0$. However, this
singularity is spurious, for when $\hat{\mathbf{P}}_1$ becomes parallel to $\hat{\mathbf{k}}$ we can treat $(P_1\cdot P_M)$
in (10.4.22) as proportional for all $M \neq 1$ to $(k\cdot P_M)$, so the singular part of
(10.4.22) is

$$\frac{G\omega^2\eta_1}{\pi^2(P_1\cdot k)}\sum_{M\neq 1}\frac{\eta_M}{(P_M\cdot k)}(P_1\cdot P_M)^2 \propto \sum_{M\neq 1}\eta_M(P_1\cdot P_M)$$

We have already remarked that $\sum_M\eta_MP_M^\mu$ must vanish when the sum is extended
over all particles, so the right-hand side is simply $-\eta_1P_1^2$, and this vanishes
because particle 1 is assumed to have zero mass. Thus no difficulty is encountered
in applying (10.4.22) to collisions involving photons, neutrinos, or even (to run
ahead of ourselves a bit) gravitons.

The total energy per unit frequency interval emitted as gravitational radiation
in a collision is obtained by integrating Eq. (10.4.22) over the directions of $\hat{\mathbf{k}}$.
We then find

$$\frac{dE}{d\omega} = \frac{G}{2\pi}\sum_{N,M}\eta_N\eta_Mm_Nm_M\,\frac{1 + \beta_{NM}^2}{\beta_{NM}(1 - \beta_{NM}^2)^{1/2}}\,\ln\left(\frac{1 + \beta_{NM}}{1 - \beta_{NM}}\right) \tag{10.4.23}$$

with $\beta_{NM}$ the relative speed of particles $N$ and $M$:

$$\beta_{NM} \equiv \left[1 - \frac{m_N^2m_M^2}{(P_N\cdot P_M)^2}\right]^{1/2}$$

For nonrelativistic two-body elastic scattering this reduces to

$$\frac{dE}{d\omega} = \frac{8G}{5\pi}\mu^2v^4\sin^2\theta \tag{10.4.24}$$

where $\mu$ is the reduced mass, $v$ is the relative velocity, and $\theta$ is the scattering angle
in the center-of-mass reference frame.
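
As a consistency check on the nonrelativistic limit, one can anticipate the quadrupole formula (10.5.8) of Section 5. The sketch below rests on assumptions that should be kept in mind: the second moment of the two-body system is taken as $\mu r_ir_j$ with relative coordinate $\mathbf{r} = \mathbf{v}t$ before and $\mathbf{v}'t$ after an instantaneous collision, so that its third time derivative is $2\mu(v'_iv'_j - v_iv_j)\delta(t)$ and $\omega^6D^*_{ij}D_{ij}$ equals $(\mu/\pi)^2$ times the square of that tensor; the quadrupole formula is used with the coefficient $4\pi G/5$; and units with $G = c = 1$ are chosen. The function names and sample values are ours.

```python
import math

G = 1.0  # units with G = c = 1 (an assumption of this sketch)


def spectrum_from_quadrupole(mu, v_in, v_out):
    """dE/d(omega) from the quadrupole integrand for an instantaneous collision."""
    jump = [[v_out[i] * v_out[j] - v_in[i] * v_in[j] for j in range(3)]
            for i in range(3)]
    full = sum(jump[i][j] ** 2 for i in range(3) for j in range(3))
    trace = sum(jump[i][i] for i in range(3))
    # (4 pi G / 5) * (mu / pi)^2 * [full - trace^2 / 3]
    return (4.0 * G * mu ** 2 / (5.0 * math.pi)) * (full - trace ** 2 / 3.0)


def spectrum_from_formula(mu, v, theta):
    """The nonrelativistic result (8 G / 5 pi) mu^2 v^4 sin^2(theta)."""
    return 8.0 * G / (5.0 * math.pi) * mu ** 2 * v ** 4 * math.sin(theta) ** 2


mu, v, theta = 3.0, 1.0e-3, 0.9  # sample reduced mass, speed, scattering angle
v_in = [0.0, 0.0, v]
v_out = [v * math.sin(theta), 0.0, v * math.cos(theta)]
from_quadrupole = spectrum_from_quadrupole(mu, v_in, v_out)
from_formula = spectrum_from_formula(mu, v, theta)
print(from_quadrupole, from_formula)  # the two values agree
```

For elastic scattering the speeds before and after are equal, so the trace term drops out and the result depends on the scattering angle only through $\sin^2\theta$.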

The gravitational radiation produced by the collisions occurring in a gas can
be determined by summing up the radiated energies per collision given by Eq.
(10.4.23) or (10.4.24), provided that there is enough time between collisions so that
they do not interfere. This condition can be expressed as

$$\omega \gg \omega_c \tag{10.4.25}$$

where $\omega_c$ is the collision frequency of a typical gas particle. (If $\omega \ll \omega_c$, then the
gas behaves as a fluid rather than as a collection of independent particles.) When
(10.4.25) is satisfied, the power per unit volume and per unit frequency interval is

$$\frac{dP}{d\omega} = \frac{8G}{5\pi}\sum_{(a,b)}n_an_b\,\mu_{ab}^2\left\langle v_{ab}^5\int\frac{d\sigma_{ab}}{d\Omega}\sin^2\theta\,d\Omega\right\rangle \tag{10.4.26}$$

where $n_a$ is the number density of gas particles of type $a$, $d\sigma_{ab}/d\Omega$ is the
center-of-mass-system differential scattering cross-section, the sum runs over all different
pairs of particle types, and the average $\langle\cdots\rangle$ is taken over all collisions.

As an example, let us calculate the gravitational radiation emitted by Coulomb
collisions in a plasma. The Rutherford scattering cross-section is

$$\frac{d\sigma_{ab}}{d\Omega} = \frac{e_a^2e_b^2}{4\mu_{ab}^2v_{ab}^4\sin^4(\theta/2)} \tag{10.4.27}$$

The integral over $\theta$ must be cut off at a minimum angle $1/\Lambda$, with $\Lambda \gg 1$ determined
by the Debye screening of the Coulomb force at large impact parameters; we have
then

$$\int\frac{d\sigma_{ab}}{d\Omega}\sin^2\theta\,d\Omega \simeq \frac{8\pi e_a^2e_b^2}{\mu_{ab}^2v_{ab}^4}\ln\Lambda \tag{10.4.28}$$


We are left with an average of $v_{ab}$, which for a Maxwell-Boltzmann distribution is

$$\langle v_{ab}\rangle = \left(\frac{2kT}{\pi\mu_{ab}}\right)^{1/2} \tag{10.4.29}$$


Putting (10.4.28) and (10.4.29) into (10.4.26) gives the power per unit volume and
per unit frequency interval (in c.g.s. units) as

$$\frac{dP}{d\omega} = \frac{64G}{5c^5}\left(\frac{2kT}{\pi}\right)^{1/2}\ln\Lambda\sum_{(a,b)}\frac{n_an_be_a^2e_b^2}{\mu_{ab}^{1/2}} \tag{10.4.30}$$

Typically $\ln\Lambda$ is of order 10. For a plasma of completely ionized hydrogen we
must take into account electron-electron and electron-proton collisions, and
(10.4.30) gives

$$\frac{dP}{d\omega} = \frac{64Gn_e^2e^4}{5c^5}\left(\frac{2kT}{\pi m_e}\right)^{1/2}(1 + \sqrt{2})\ln\Lambda \tag{10.4.31}$$


The electron collision frequency may in this case be estimated as

$$\omega_c \simeq \frac{e^4n_e\langle v\rangle}{(m_e\langle v^2\rangle)^2} \simeq \frac{e^4n_e}{m_e^{1/2}(kT)^{3/2}} \tag{10.4.32}$$


Equation (10.4.30) or (10.4.31) holds for $\omega \gg \omega_c$ and $\hbar\omega \ll kT$.

These results can be applied to the hydrogen plasma in the solar core. Within
a volume $V$ of order $2 \times 10^{31}$ cm$^3$ this plasma has $T \simeq 10^7$ °K, $n_e \simeq 3 \times
10^{25}$ cm$^{-3}$, and $\ln\Lambda \simeq 4$. The collision frequency (10.4.32) is $10^{15}$ sec$^{-1}$, three
orders of magnitude less than the thermal frequency $kT/\hbar \approx 10^{18}$ sec$^{-1}$, so the
total power produced in gravitational radiation can be estimated by multiplying
(10.4.31) by $VkT/\hbar$. In this way we find that the thermal collisions in the solar
core produce about $10^8$ watts of gravitational radiation.
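
The order of magnitude of this estimate can be reproduced with a short calculation. The sketch below is hedged in two respects: the physical constants are rounded c.g.s. values supplied by us, and the expressions coded for the power per unit volume and frequency and for the collision frequency are the forms $\frac{64Gn_e^2e^4}{5c^5}\left(\frac{2kT}{\pi m_e}\right)^{1/2}(1 + \sqrt{2})\ln\Lambda$ and $e^4n_e/[m_e^{1/2}(kT)^{3/2}]$, which are assumptions of the example as far as their numerical prefactors are concerned.

```python
import math

# Rounded physical constants in c.g.s. units (values supplied for this sketch).
G = 6.674e-8          # cm^3 g^-1 s^-2
C = 2.998e10          # cm s^-1
E_CHARGE = 4.803e-10  # esu
M_E = 9.109e-28       # g
K_B = 1.381e-16       # erg K^-1
HBAR = 1.055e-27      # erg s

# Solar-core parameters quoted in the text.
T = 1.0e7             # K
N_E = 3.0e25          # cm^-3
LN_LAMBDA = 4.0
VOLUME = 2.0e31       # cm^3

# Power per unit volume and per unit frequency (assumed form, see lead-in).
dP_domega = (64.0 * G * N_E ** 2 * E_CHARGE ** 4 / (5.0 * C ** 5)
             * math.sqrt(2.0 * K_B * T / (math.pi * M_E))
             * (1.0 + math.sqrt(2.0)) * LN_LAMBDA)

# Collision frequency (assumed form) and thermal frequency kT / hbar.
omega_c = E_CHARGE ** 4 * N_E / (math.sqrt(M_E) * (K_B * T) ** 1.5)
omega_thermal = K_B * T / HBAR

# Total power: multiply by the volume and by the thermal bandwidth.
power_watts = dP_domega * VOLUME * omega_thermal * 1.0e-7  # erg/s -> W

print(f"omega_c ~ {omega_c:.1e} s^-1, kT/hbar ~ {omega_thermal:.1e} s^-1")
print(f"P ~ {power_watts:.1e} W")
```

With these inputs the total comes out at a few times $10^7$ W, the same order of magnitude as the figure quoted above; the result scales as $n_e^2T^{3/2}$, so modest changes in the assumed core conditions shift it by factors of a few.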


5 Quadrupole Radiation 

Up to this point we have made no approximations beyond the basic assumption
that the fields are weak. (Our use of the wave zone limits $r \gg R$, $r \gg 1/\omega$, $r \gg \omega R^2$
was not really an approximation, since we can always choose $r$ large enough to make
these assumptions true; and by the conservation of energy, the power passing
through a sphere at large $r$ must equal that passing through any surface enclosing
the radiating system.) We now make a further approximation, and assume that
the source radius $R$ is much smaller than the wavelength $1/\omega$:

$$\omega R \ll 1 \tag{10.5.1}$$

Most of the radiation is emitted at frequencies of order $v/R$, where $v$ is some typical
velocity within the system, so we are really making the same sort of approximation
as that made in the previous chapter, that is, $v \ll 1$.

When (10.5.1) holds we may approximate the Fourier transforms needed in
(10.4.14) and (10.4.16) by the $\mathbf{k}$-independent integral

$$T_{ij}(\mathbf{k}, \omega) \simeq \int d^3x\,T_{ij}(\mathbf{x}, \omega) \tag{10.5.2}$$

This can be rewritten in a useful way by using the conservation laws in the form

$$\frac{\partial^2}{\partial x^i\,\partial x^j}T^{ij}(\mathbf{x}, \omega) = -\omega^2T^{00}(\mathbf{x}, \omega)$$

Multiplying with $x^ix^j$ and integrating over $\mathbf{x}$, we find

$$T_{ij}(\mathbf{k}, \omega) \simeq -\frac{\omega^2}{2}D_{ij}(\omega) \tag{10.5.3}$$

$$D_{ij}(\omega) \equiv \int d^3x\,x^ix^jT^{00}(\mathbf{x}, \omega) \tag{10.5.4}$$

The power per solid angle is therefore

$$\frac{dP}{d\Omega} = \frac{G\omega^6}{4\pi}\Lambda_{ij,lm}(\hat{k})\,D^*_{ij}(\omega)\,D_{lm}(\omega) \tag{10.5.5}$$

If the source is a sum of Fourier components, then the power radiated is a sum of
terms such as (10.5.5). If the source is a Fourier integral like (10.4.1), then the
energy emitted per solid angle is

$$\frac{dE}{d\Omega} = \frac{G}{2}\Lambda_{ij,lm}(\hat{k})\int_0^\infty\omega^6\,D^*_{ij}(\omega)\,D_{lm}(\omega)\,d\omega \tag{10.5.6}$$
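The coefficients $D_{ij}(\omega)$ in (10.5.5) and (10.5.6) do not depend on the direction
$\hat{\mathbf{k}}$ of the emitted radiation, so we can do the integral over solid angle once and for
all. We use the formulas

$$\int d\Omega\,\hat{k}_i\hat{k}_j = \frac{4\pi}{3}\delta_{ij}$$

$$\int d\Omega\,\hat{k}_i\hat{k}_j\hat{k}_l\hat{k}_m = \frac{4\pi}{15}\left(\delta_{ij}\delta_{lm} + \delta_{il}\delta_{jm} + \delta_{im}\delta_{jl}\right)$$

(The form of the right-hand sides is dictated by symmetry and rotational invariance;
the numerical coefficients can be calculated by contracting $i$ with $j$ and $l$
with $m$.) We then find

$$\int d\Omega\,\Lambda_{ij,lm}(\hat{k}) = \frac{2\pi}{15}\left[11\delta_{il}\delta_{jm} - 4\delta_{ij}\delta_{lm} + \delta_{im}\delta_{jl}\right]$$

so the power emitted at a single discrete frequency $\omega$ is

$$P = \frac{2G\omega^6}{5}\left[D^*_{ij}(\omega)D_{ij}(\omega) - \frac{1}{3}|D_{ii}(\omega)|^2\right] \tag{10.5.7}$$

whereas for a smooth distribution of frequencies the total emitted energy is

$$E = \frac{4\pi G}{5}\int_0^\infty\omega^6\left[D^*_{ij}(\omega)D_{ij}(\omega) - \frac{1}{3}|D_{ii}(\omega)|^2\right]d\omega \tag{10.5.8}$$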
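
The angular average of $\Lambda_{ij,lm}$ can be verified by direct quadrature over the sphere. The sketch below is an illustration under stated assumptions: $\Lambda_{ij,lm}$ is coded in the form $\delta_{il}\delta_{jm} - 2\hat{k}_j\hat{k}_m\delta_{il} + \frac{1}{2}\hat{k}_i\hat{k}_j\hat{k}_l\hat{k}_m - \frac{1}{2}\delta_{ij}\delta_{lm} + \frac{1}{2}\delta_{ij}\hat{k}_l\hat{k}_m + \frac{1}{2}\delta_{lm}\hat{k}_i\hat{k}_j$, and the integral is compared with $\frac{2\pi}{15}[11\delta_{il}\delta_{jm} - 4\delta_{ij}\delta_{lm} + \delta_{im}\delta_{jl}]$. The quadrature rule (three Gauss-Legendre points in $\cos\theta$, eight equally spaced azimuths) is our choice; it is exact for polynomials of degree four in $\hat{\mathbf{k}}$, which is all that is needed here.

```python
import math

# Product quadrature on the sphere: 3-point Gauss-Legendre in u = cos(theta)
# and 8 equally spaced azimuthal angles.
GL_NODES = [-math.sqrt(0.6), 0.0, math.sqrt(0.6)]
GL_WEIGHTS = [5.0 / 9.0, 8.0 / 9.0, 5.0 / 9.0]
N_PHI = 8


def delta(a, b):
    """Kronecker delta."""
    return 1.0 if a == b else 0.0


def lam(i, j, l, m, n):
    """Lambda_{ij,lm}(n) for a unit vector n (form stated in the lead-in)."""
    return (delta(i, l) * delta(j, m) - 2.0 * n[j] * n[m] * delta(i, l)
            + 0.5 * n[i] * n[j] * n[l] * n[m] - 0.5 * delta(i, j) * delta(l, m)
            + 0.5 * delta(i, j) * n[l] * n[m] + 0.5 * delta(l, m) * n[i] * n[j])


def sphere_integral(i, j, l, m):
    """Integrate Lambda_{ij,lm} over all directions of the unit vector."""
    total = 0.0
    for u, w in zip(GL_NODES, GL_WEIGHTS):
        s = math.sqrt(1.0 - u * u)
        for a in range(N_PHI):
            phi = 2.0 * math.pi * a / N_PHI
            n = [s * math.cos(phi), s * math.sin(phi), u]
            total += w * (2.0 * math.pi / N_PHI) * lam(i, j, l, m, n)
    return total


def closed_form(i, j, l, m):
    """(2 pi / 15) [11 d_il d_jm - 4 d_ij d_lm + d_im d_jl]."""
    return (2.0 * math.pi / 15.0) * (11.0 * delta(i, l) * delta(j, m)
                                     - 4.0 * delta(i, j) * delta(l, m)
                                     + delta(i, m) * delta(j, l))


idx = range(3)
worst = max(abs(sphere_integral(i, j, l, m) - closed_form(i, j, l, m))
            for i in idx for j in idx for l in idx for m in idx)
print(worst)  # zero up to rounding error
```

Contracting the closed form with a symmetric $D_{ij}$ gives $\frac{2\pi}{15}[12D^*_{ij}D_{ij} - 4|D_{ii}|^2]$, which is where the relative factor $\frac{1}{3}$ between the two terms of the total power comes from.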

Before going on to calculate the quadrupole radiation emitted in a few special
cases, it will be necessary to pause for a few comments on the method of calculation:

(A) The quadrupole approximation is usually applicable to nonrelativistic
systems, and for these systems the energy density $T^{00}(\mathbf{x}, \omega)$ is dominated by the
rest-mass density of the system. It may be surprising that we do not need to take
explicit account of the potential and kinetic energy terms in the full tensor $T^{\mu\nu}$,
because such terms must be included if $T^{\mu\nu}$ is to be conserved! Indeed, for a system
of particles bound by gravitational forces we should in principle take $T^{\mu\nu}$ as the
total “tensor” $\tau^{\mu\nu}$ constructed in Section 7.6, including terms nonlinear in the
gravitational fields. However, we have already exploited energy and momentum
conservation in our derivation of Eqs. (10.5.3)-(10.5.6), and since this has given
a result involving only $T^{00}$, we are now free to approximate $T^{00}$ with the
rest-mass density.

(B) For general systems of vibrating and/or rotating solids, it is often quite
difficult to evaluate the Fourier transform $T^{00}(\mathbf{x}, \omega)$, defined by Eq. (10.4.1)
or (10.4.2). It is much easier first to evaluate the moments

$$D_{ij}(t) \equiv \int d^3x\,x^ix^jT^{00}(\mathbf{x}, t) \tag{10.5.9}$$
and then evaluate $D_{ij}(\omega)$ by expressing $D_{ij}(t)$ as a Fourier integral,

$$D_{ij}(t) = \int_0^\infty d\omega\,D_{ij}(\omega)e^{-i\omega t} + \text{c.c.} \tag{10.5.10}$$

or as a sum of Fourier components,

$$D_{ij}(t) = \sum_\omega e^{-i\omega t}D_{ij}(\omega) + \text{c.c.} \tag{10.5.11}$$

(C) The question may arise: What origin should be taken for the coordinates
$x^i$ in the integral (10.5.4) for $D_{ij}(\omega)$? In principle, it doesn’t matter. When we shift
the origin of coordinates by an amount $a^i$, we change $D_{ij}(t)$ into

$$\begin{aligned}
\int (x^i - a^i)(x^j - a^j)\,T^{00}(\mathbf{x}, t)\,d^3x = {} & \int x^ix^jT^{00}(\mathbf{x}, t)\,d^3x - a^i\int x^jT^{00}(\mathbf{x}, t)\,d^3x \\
& - a^j\int x^iT^{00}(\mathbf{x}, t)\,d^3x + a^ia^j\int T^{00}(\mathbf{x}, t)\,d^3x
\end{aligned}$$

But conservation of energy and momentum tells us that the last three terms are at
most linear functions of time, because

$$\frac{d}{dt}\int T^{00}(\mathbf{x}, t)\,d^3x = -\int\frac{\partial}{\partial x^i}T^{i0}(\mathbf{x}, t)\,d^3x = 0$$

$$\frac{d^2}{dt^2}\int x^iT^{00}(\mathbf{x}, t)\,d^3x = \int x^i\frac{\partial^2}{\partial x^j\,\partial x^k}T^{jk}(\mathbf{x}, t)\,d^3x = -\int\frac{\partial}{\partial x^j}T^{ij}(\mathbf{x}, t)\,d^3x = 0$$

Thus the shift in origin does not affect the Fourier components with $\omega \neq 0$, that is,

$$D_{ij}(\omega) \equiv \int x^ix^jT^{00}(\mathbf{x}, \omega)\,d^3x = \int (x^i - a^i)(x^j - a^j)\,T^{00}(\mathbf{x}, \omega)\,d^3x \tag{10.5.12}$$

However, it is only when we take $T^{00}$ as the energy density of the entire system
that we can shift origins freely in computing $D_{ij}(\omega)$.

As a first example, let us calculate the gravitational radiation produced by a
sound wave in a tube lying in the $z$-direction. The density of the vibrating material
can be written

$$\rho = \rho_0 + \rho_1$$

where $\rho_0$ is the constant unperturbed value and $\rho_1$ is a small perturbation. We
also treat the material velocity $v$ (in the $z$-direction) as a small perturbation, and
neglect dissipative effects. The equations of motion are then

$$\frac{\partial \rho_1}{\partial t} + \rho_0 \frac{\partial v}{\partial z} = 0$$

$$\rho_0 \frac{\partial v}{\partial t} + v_s^2 \frac{\partial \rho_1}{\partial z} = 0$$

where $v_s$ is the speed of sound. The tube is not supported at its ends (otherwise
we would have to take into account the gravitational radiation emitted by the
support!), so the pressure $v_s^2 \rho_1$ must vanish at the tube ends. With this boundary
condition, the general solution for a tube extending from $z = 0$ to $z = L$ is a
superposition of the normal modes

$$v = -\epsilon v_s \cos kz\, \sin(\omega t + \phi) \qquad (10.5.13)$$

$$\rho_1 = \epsilon \rho_0 \sin kz\, \cos(\omega t + \phi) \qquad (10.5.14)$$

where $\epsilon$ is a small dimensionless number, $\phi$ is an arbitrary phase, and

$$k = \frac{N\pi}{L} \qquad \omega = \frac{N\pi v_s}{L} \qquad (10.5.15)$$

with $N$ any positive integer. Since $v$ is not constrained to vanish at the tube ends,
these ends will in general be displaced by amounts $\delta(0, t)$ and $\delta(L, t)$, respectively,
where

$$\delta(z, t) \equiv \int v(z, t)\, dt = \epsilon v_s \omega^{-1} \cos kz\, \cos(\omega t + \phi)$$


The time-dependent part of the second moment of the mass density is given here by

$$D_{ij}(t) = n_i n_j A \left[ \int_0^L \rho_1(z, t)\, z^2\, dz + L^2 \rho_0\, \delta(L, t) \right]$$

where $A$ is the cross-sectional area of the tube, and $\mathbf{n} \equiv (0, 0, 1)$ is a unit vector
in the $z$-direction. This vanishes for $N$ even, whereas for $N$ odd we find

$$D_{ij}(t) = -\frac{4 n_i n_j M L^2 \epsilon}{\pi^3 N^3} \cos(\omega t + \phi)$$


where $M \equiv \rho_0 A L$ is the mass of the tube. (The reader can easily check that $D_{ij}(t)$
would be the same if the second moment of the mass distribution were evaluated
using as origin some point other than $z = 0$.) Comparing with Eq. (10.5.11), we
see that $D_{ij}(t)$ has a Fourier component

$$D_{ij}(\omega) = -\frac{2 n_i n_j M L^2 \epsilon}{\pi^3 N^3}\, e^{-i\phi} \qquad (10.5.16)$$

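The claim that the second moment vanishes for even $N$ and equals $-4ML^2\epsilon/\pi^3N^3$ times $\cos(\omega t + \phi)$ for odd $N$ can be checked numerically. The following is only a sketch: the function names, the unit choices ($L = \rho_0 = A = \epsilon = 1$), and the midpoint-rule quadrature are assumptions made for illustration and are not part of the text.

```python
import math

def moment_amplitude_numeric(N, L=1.0, rho0=1.0, A=1.0, eps=1.0, steps=20000):
    """Coefficient of cos(wt + phi) in D_zz(t), by midpoint-rule quadrature.

    Two pieces, as in the text: the density perturbation rho_1 integrated
    against z^2, and the mass displaced past the free end at z = L.
    """
    k = N * math.pi / L
    h = L / steps
    # integral of sin(kz) * z^2 over the tube
    integral = h * sum(
        math.sin(k * (i + 0.5) * h) * ((i + 0.5) * h) ** 2 for i in range(steps)
    )
    # end displacement: delta(L, t) has amplitude eps * (v_s / omega) * cos(kL),
    # and v_s / omega = 1 / k for the normal modes
    end_term = L ** 2 * math.cos(k * L) / k
    return A * eps * rho0 * (integral + end_term)

def moment_amplitude_closed(N, L=1.0, rho0=1.0, A=1.0, eps=1.0):
    """Closed form quoted in the text: zero for even N, -4 M L^2 eps / (pi N)^3 for odd N."""
    M = rho0 * A * L
    if N % 2 == 0:
        return 0.0
    return -4.0 * M * L ** 2 * eps / (math.pi ** 3 * N ** 3)

for N in (1, 2, 3, 4):
    print(N, moment_amplitude_numeric(N), moment_amplitude_closed(N))
```

The end-displacement term is what cancels the large $L^2/k$ piece of the integral; dropping it would leave a nonzero moment for even $N$.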
The radiated power is thus given by Eq. (10.5.7) (in c.g.s. units) as

$$P = \frac{16 G M^2 v_s^6 \epsilon^2}{15 L^2 c^5} \qquad (10.5.17)$$

for each odd $N$. This may be compared with the total energy of the oscillation,
which is simply the kinetic energy at the times when $\rho_1$ vanishes, that is, when $v$
is greatest:

$$E = \tfrac12 \rho_0 A \int_0^L v_{\max}^2(z)\, dz = \tfrac14 M v_s^2 \epsilon^2$$

Evidently the emission of gravitational radiation will cause the oscillator to lose
energy at a rate

$$\Gamma_{\rm grav} \equiv \frac{P}{E} = \frac{64 G M v_s^4}{15 L^2 c^5} \qquad (10.5.18)$$


For instance, let us calculate the rate of gravitational radiation by acoustic
oscillations in the large aluminum cylinders used as antennas in Weber’s experiments
on gravitational radiation.$^{1}$ (As we shall see, the effective cross-section of
such antennas is determined by $\Gamma_{\rm grav}$.) Weber’s cylinders have the parameters

$$L = 153\ \text{cm} \qquad v_s = 5.1 \times 10^{5}\ \text{cm/sec} \qquad M = 1.4 \times 10^{6}\ \text{gm}$$

Hence, if gravitational radiation were the only loss mechanism, the oscillations
(10.5.13), (10.5.14) with $N$ odd would lose energy at a rate

$$\Gamma_{\rm grav} = 4.7 \times 10^{-35}\ \text{sec}^{-1} \qquad (10.5.19)$$

In contrast, the actual decay rate $\Gamma$ of the $N = 1$ mode in this cylinder is about
0.15 sec$^{-1}$, owing primarily to viscous dissipation within the aluminum. Hence
the “branching ratio” of gravitational radiation here is of order

$$\eta \equiv \frac{\Gamma_{\rm grav}}{\Gamma} \simeq 3 \times 10^{-34} \qquad (N = 1) \qquad (10.5.20)$$

Any ordinary mechanical oscillation will always give up vastly more of its energy
to heat than to gravitational radiation.
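The numbers in Eqs. (10.5.19) and (10.5.20) follow directly from Eq. (10.5.18). The sketch below simply evaluates that formula; the values of $G$ and $c$ are standard c.g.s. constants supplied here as assumptions, and the function name is made up for illustration.

```python
G = 6.674e-8   # gravitational constant, cm^3 g^-1 s^-2 (assumed standard value)
C = 2.998e10   # speed of light, cm/s (assumed standard value)

def gamma_grav_bar(mass_g, length_cm, sound_speed_cm_s):
    """Eq. (10.5.18): gravitational decay rate of an odd-N longitudinal mode, in 1/sec."""
    return 64.0 * G * mass_g * sound_speed_cm_s ** 4 / (15.0 * length_cm ** 2 * C ** 5)

# Parameters of Weber's cylinders as quoted in the text
gamma_grav = gamma_grav_bar(mass_g=1.4e6, length_cm=153.0, sound_speed_cm_s=5.1e5)

# Observed total decay rate of the N = 1 mode, 0.15 per second
eta = gamma_grav / 0.15

print(f"Gamma_grav = {gamma_grav:.2e} 1/sec")
print(f"eta        = {eta:.1e}")
```

Because the mode amplitude $\epsilon$ cancels in the ratio $P/E$, the decay rate depends only on the bar's mass, length, and sound speed.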

As a second example, let us calculate the power radiated by a rotating body.
If the body rotates rigidly about the 3-axis with angular frequency $\Omega$, then the mass
density $T^{00}$ will take the form

$$T^{00}(\mathbf{x}, t) = \rho(\mathbf{x}')$$

where $\rho(\mathbf{x}')$ is the mass density expressed in coordinates $\mathbf{x}'$ fixed in the body,
defined by

$$x_1 \equiv x'_1 \cos\Omega t - x'_2 \sin\Omega t$$

$$x_2 \equiv x'_1 \sin\Omega t + x'_2 \cos\Omega t$$

$$x_3 \equiv x'_3$$

Hence, by changing coordinates in (10.5.9), we may express $D_{ij}(t)$ in terms of
the moment-of-inertia tensor in body-fixed coordinates

$$I_{ij} \equiv \int d^3x'\, x'_i x'_j\, \rho(\mathbf{x}') \qquad (10.5.21)$$

For simplicity, let us consider rotation around one of the principal axes of the
ellipsoid of inertia, so that $I_{13} = I_{23} = 0$. We can also choose the $x'_1$ and $x'_2$
axes along the two other principal axes, so that $I_{12} = 0$. With $I_{ij}$ diagonal, we
now find

$$D_{11}(t) = \tfrac12 (I_{11} + I_{22}) + \tfrac12 (I_{11} - I_{22}) \cos 2\Omega t$$

$$D_{12}(t) = \tfrac12 (I_{11} - I_{22}) \sin 2\Omega t$$

$$D_{22}(t) = \tfrac12 (I_{11} + I_{22}) - \tfrac12 (I_{11} - I_{22}) \cos 2\Omega t$$

$$D_{13}(t) = D_{23}(t) = 0$$

$$D_{33}(t) = I_{33}$$

The nonvanishing Fourier coefficients for $\omega = 2\Omega$ in Eq. (10.5.11) are then

$$D_{11}(2\Omega) = -D_{22}(2\Omega) = i D_{12}(2\Omega) = \tfrac14 (I_{11} - I_{22})$$

According to Eq. (10.5.7), the total power emitted at twice the rotation frequency
is then (in c.g.s. units)

$$P(2\Omega) = \frac{32 G \Omega^6 I^2 e^2}{5 c^5} \qquad (10.5.22)$$

where $I$ and $e$ are the moment of inertia and equatorial ellipticity,

$$I \equiv I_{11} + I_{22} \qquad (10.5.23)$$

$$e \equiv \frac{I_{11} - I_{22}}{I} \qquad (10.5.24)$$
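As a cross-check of Eq. (10.5.22), one can build $D_{ij}(t)$ by rotating a diagonal body-frame tensor, extract the Fourier component at $\omega = 2\Omega$ numerically, and insert it into the general quadrupole formula (10.5.7). This is a sketch under stated assumptions: the function names, the sample principal moments, and the discrete averaging used to extract the Fourier coefficient are illustrative choices, and the c.g.s. form of (10.5.7) with a factor $1/c^5$ is assumed.

```python
import cmath
import math

G = 6.674e-8   # cm^3 g^-1 s^-2 (assumed standard value)
C = 2.998e10   # cm/s (assumed standard value)

def second_moment(t, I_diag, Omega):
    """D_ij(t) = R_ik R_jk I_kk for rigid rotation about the 3-axis."""
    c, s = math.cos(Omega * t), math.sin(Omega * t)
    R = [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]
    return [[sum(R[i][k] * R[j][k] * I_diag[k] for k in range(3))
             for j in range(3)] for i in range(3)]

def fourier_component(I_diag, Omega, omega, samples=64):
    """D_ij(omega) in D_ij(t) = D_ij(omega) e^{-i omega t} + c.c. + const,
    found by averaging D_ij(t) e^{+i omega t} over one rotation period."""
    period = 2.0 * math.pi / Omega
    D = [[0j] * 3 for _ in range(3)]
    for n in range(samples):
        t = n * period / samples
        phase = cmath.exp(1j * omega * t)
        Dt = second_moment(t, I_diag, Omega)
        for i in range(3):
            for j in range(3):
                D[i][j] += Dt[i][j] * phase / samples
    return D

def quadrupole_power(D, omega):
    """Eq. (10.5.7) in c.g.s. units: (2 G w^6 / 5 c^5) [D*_ij D_ij - |D_ii|^2 / 3]."""
    contraction = sum(abs(D[i][j]) ** 2 for i in range(3) for j in range(3))
    trace = sum(D[i][i] for i in range(3))
    return 2.0 * G * omega ** 6 / (5.0 * C ** 5) * (contraction - abs(trace) ** 2 / 3.0)

I_diag = (3.0e45, 1.0e45, 2.0e45)   # hypothetical principal moments, g cm^2
Omega = 100.0                        # hypothetical rotation rate, 1/sec

p_numeric = quadrupole_power(fourier_component(I_diag, Omega, 2.0 * Omega), 2.0 * Omega)

I_total = I_diag[0] + I_diag[1]
ellipticity = (I_diag[0] - I_diag[1]) / I_total
p_formula = 32.0 * G * Omega ** 6 * I_total ** 2 * ellipticity ** 2 / (5.0 * C ** 5)

print(p_numeric, p_formula)
```

Only the magnitudes of the Fourier coefficients enter the power, so the check does not depend on the phase of $D_{12}(2\Omega)$.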


A body with circular symmetry around the axis of rotation will have $e = 0$, and
therefore will not emit gravitational radiation. (Indeed, this conclusion does not
even depend on the quadrupole approximation, since such a body, though rotating,
has a time-independent energy-momentum tensor.) On the other hand, for a point
mass $m$ fixed in the rotating coordinate system at a point $x'_1 = r$, $x'_2 = x'_3 = 0$,
the only nonvanishing element of $I_{ij}$ will be $I_{11} = mr^2$, so that $I = mr^2$ and
$e = 1$, and Eq. (10.5.22) gives a radiated power

$$P(2\Omega) = \frac{32 G \Omega^6 m^2 r^4}{5 c^5} \qquad (10.5.25)$$

For instance, for the orbital motion of the planet Jupiter, we have

$$m = 1.9 \times 10^{30}\ \text{g}, \qquad \Omega = 1.68 \times 10^{-8}\ \text{sec}^{-1}, \qquad r = 7.78 \times 10^{13}\ \text{cm}$$

and Eq. (10.5.25) gives a gravitational radiation power of only 5.3 kW, less even
than the solar thermal gravitation power calculated in the last section. At this 
rate, it would take very much longer than the age of the solar system to observe 
any effects of this energy loss on Jupiter’s orbit. 
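The Jupiter estimate is a direct evaluation of Eq. (10.5.25). A minimal sketch, assuming standard c.g.s. values for $G$ and $c$ (these constants and the function name are not given in the text):

```python
G = 6.674e-8   # cm^3 g^-1 s^-2 (assumed standard value)
C = 2.998e10   # cm/s (assumed standard value)

def point_mass_power(m, Omega, r):
    """Eq. (10.5.25): power radiated at 2*Omega by a point mass on a circular orbit, erg/sec."""
    return 32.0 * G * Omega ** 6 * m ** 2 * r ** 4 / (5.0 * C ** 5)

# Jupiter's orbital parameters as quoted in the text
p_erg_per_sec = point_mass_power(m=1.9e30, Omega=1.68e-8, r=7.78e13)
p_kilowatts = p_erg_per_sec * 1.0e-7 / 1.0e3   # 1 erg/sec = 1e-7 W

print(f"P = {p_erg_per_sec:.2e} erg/sec = {p_kilowatts:.1f} kW")
```

The result is sensitive to the input values because of the $\Omega^6 r^4$ dependence; small differences in the adopted constants shift the last digit.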

The negligibility of gravitational radiation in celestial mechanics can be stated
in more general terms. For a system consisting of particles with typical mass $M$,
typical separations $\bar r$, and typical velocities $\bar v$, the power radiated at a frequency
$\omega$ of order $\bar v/\bar r$ will be of order [compare Eq. (10.5.7)]

$$P \approx G \left( \frac{\bar v}{\bar r} \right)^6 M^2 \bar r^4$$

or, since $GM/\bar r$ is of order $\bar v^2$,

$$P \approx \frac{M \bar v^8}{\bar r}$$

The typical deceleration $a_{\rm rad}$ of particles in the system owing to this energy loss
is given by the power $P$ divided by the momentum $M\bar v$, or

$$a_{\rm rad} \approx \frac{\bar v^7}{\bar r}$$

This may be compared with the accelerations computed in Newtonian mechanics,
which are of order $\bar v^2/\bar r$, and with the post-Newtonian corrections discussed in the
last chapter, which are of order $\bar v^4/\bar r$. [Radiation effects enter with an odd power
of $\bar v$ because they represent an irreversible process, as shown by our use of an
outgoing wave solution in Eq. (10.4.4).] Since radiation reaction is smaller than the
post-Newtonian effects by a factor $\bar v^3 < 10^{-12}$, the neglect of radiation reaction
in the last section was perfectly justified. Indeed, if we had the strength we could
even compute the post-post-Newtonian accelerations,$^{2}$ which are of order $\bar v^6/\bar r$,
without encountering the effects of gravitational radiation!
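To make the hierarchy concrete, the scaling estimates above can be evaluated for Jupiter's orbit, with velocities expressed in units of $c$. This is only an order-of-magnitude sketch; the choice of Jupiter as the test case and the variable names are assumptions for illustration.

```python
C = 2.998e10   # cm/s (assumed standard value)

# Jupiter's orbital parameters from the example above
Omega = 1.68e-8   # 1/sec
r = 7.78e13       # cm

beta = Omega * r / C   # typical velocity in units of c

# Accelerations relative to the Newtonian value v^2 / r
post_newtonian = beta ** 2        # v^4 / r  divided by  v^2 / r
post_post_newtonian = beta ** 4   # v^6 / r  divided by  v^2 / r
radiation_reaction = beta ** 5    # v^7 / r  divided by  v^2 / r

print(f"v/c = {beta:.2e}")
print(f"post-Newtonian / Newtonian      ~ {post_newtonian:.1e}")
print(f"post-post-Newtonian / Newtonian ~ {post_post_newtonian:.1e}")
print(f"radiation reaction / Newtonian  ~ {radiation_reaction:.1e}")
print(f"radiation / post-Newtonian      ~ {beta ** 3:.1e}")
```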

The discovery of the pulsars has provided us with a more promising source of
gravitational radiation. As discussed in Section 11.4, pulsars are probably neutron
stars,$^{3}$ with masses of the order of one solar mass, radii of the order of 10 km, and
hence moments of inertia $I$ of the order of $10^{45}$ g cm$^2$. A newborn pulsar, formed
in a supernova, may be rotating with $\Omega$ of order $10^{4}$ sec$^{-1}$, so according to Eq.
(10.5.22), it would be emitting gravitational radiation at a rate of order $10^{55} e^2$
ergs/sec. For comparison, the total rotational energy of the pulsar would be about
$10^{53}$ ergs, so most of the pulsar’s kinetic energy would be radiated away as gravitational
waves$^{4}$ within a few years, provided that the equatorial ellipticity $e$ is
greater than about $10^{-4}$. This is too large a static ellipticity to be maintained in
the huge gravitational field of a neutron star, but it might be possible for dynamical
effects to produce a mean ellipticity this large, particularly in the early period
before the pulsar has settled down to its equilibrium configuration. Eventually
the pulsar will slow down sufficiently so that other loss mechanisms, such as
magnetic dipole radiation (for which $P \propto \Omega^4$) become more important than
gravitational radiation.
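The pulsar estimate can be reproduced, and the spin-down time made explicit, with a few lines. This sketch assumes that gravitational radiation is the only loss and that the ellipticity stays constant, so that $I\Omega\,\dot\Omega = -P(2\Omega)$ integrates in closed form; the 10 percent energy threshold and all function names are illustrative assumptions rather than anything stated in the text.

```python
G = 6.674e-8    # cm^3 g^-1 s^-2 (assumed standard value)
C = 2.998e10    # cm/s (assumed standard value)
YEAR = 3.156e7  # seconds per year

def gw_power(I, Omega, e):
    """Eq. (10.5.22), erg/sec."""
    return 32.0 * G * Omega ** 6 * I ** 2 * e ** 2 / (5.0 * C ** 5)

def rotational_energy(I, Omega):
    return 0.5 * I * Omega ** 2

def time_to_energy_fraction(I, Omega0, e, fraction):
    """Seconds until the rotational energy falls to `fraction` of its initial value.

    With dOmega/dt = -K Omega^5 and K = 32 G I e^2 / (5 c^5), integration gives
    t = (Omega^-4 - Omega0^-4) / (4 K), and Omega^4 = Omega0^4 * fraction^2.
    """
    K = 32.0 * G * I * e ** 2 / (5.0 * C ** 5)
    return (1.0 / fraction ** 2 - 1.0) / (4.0 * K * Omega0 ** 4)

I = 1.0e45       # g cm^2, order of magnitude from the text
Omega0 = 1.0e4   # 1/sec, order of magnitude from the text

power_per_e2 = gw_power(I, Omega0, 1.0)
e_rot = rotational_energy(I, Omega0)
t_years = time_to_energy_fraction(I, Omega0, 1.0e-4, 0.1) / YEAR

print(f"P / e^2 = {power_per_e2:.1e} erg/sec")
print(f"E_rot   = {e_rot:.1e} erg")
print(f"time to radiate 90% of E_rot at e = 1e-4: {t_years:.2f} yr")
```

Because $P \propto \Omega^6$, the loss slows sharply as the star spins down, which is why the time to shed most of the energy is much longer than the initial ratio $E/P$.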
6 Scattering and Absorption of Gravitational Radiation

Consider a plane gravitational wave with polarization $e_{\mu\nu}$ and wave vector $k^\mu$,
impinging on a target at the origin. At great distances from the target, the gravitational
wave will in general consist of the plane wave and an outgoing scattered
wave,

$$h_{\mu\nu}(\mathbf{x}, t) \to \left[ e_{\mu\nu}\, e^{i\mathbf{k}\cdot\mathbf{x}} + f_{\mu\nu}(\hat{\mathbf{x}})\, \frac{e^{i\omega r}}{r} \right] e^{-i\omega t} + \text{c.c.} \qquad (10.6.1)$$

where $r \equiv |\mathbf{x}|$, $\hat{\mathbf{x}} \equiv \mathbf{x}/r$, $\omega \equiv |\mathbf{k}|$, and $f_{\mu\nu}$ is a scattering amplitude, which may
depend on $\hat{\mathbf{x}}$ and $\omega$, but not on $r$ or $t$.

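In order to analyze the energy balance between the gravitational wave and the
target, it is necessary to decompose the wave (10.6.1) into incoming and outgoing
parts. The plane-wave part has the Legendre expansion$^{6}$

$$e^{i\mathbf{k}\cdot\mathbf{x}} = \sum_{l=0}^{\infty} (2l + 1)\, P_l(\hat{\mathbf{k}}\cdot\hat{\mathbf{x}})\, i^l j_l(\omega r)$$

where $P_l$ is the usual Legendre polynomial and $j_l$ is the spherical Bessel function$^{7}$
of order $l$. Asymptotically, we have$^{8}$

$$i^l j_l(\omega r) \to \frac{1}{2i\omega r} \left[ e^{i\omega r} - (-)^l e^{-i\omega r} \right]$$

so the sums over $l$ become simply the Legendre expansions of delta functions$^{9}$:

$$\sum_l (2l + 1)\, P_l(\mu) = 2\, \delta(1 - \mu)$$

$$\sum_l (2l + 1) (-)^l P_l(\mu) = 2\, \delta(1 + \mu)$$

The plane wave may therefore be asymptotically decomposed into outgoing and
ingoing waves,

$$e^{i\mathbf{k}\cdot\mathbf{x}} \underset{r \to \infty}{\longrightarrow} \frac{e^{i\omega r}}{i\omega r}\, \delta(1 - \hat{\mathbf{k}}\cdot\hat{\mathbf{x}}) - \frac{e^{-i\omega r}}{i\omega r}\, \delta(1 + \hat{\mathbf{k}}\cdot\hat{\mathbf{x}})$$

and the gravitational wave (10.6.1) has the corresponding decomposition

$$h_{\mu\nu}(\mathbf{x}, t) \to \left[ e^{\rm out}_{\mu\nu}(\hat{\mathbf{x}})\, \frac{e^{i\omega r}}{r} + e^{\rm in}_{\mu\nu}(\hat{\mathbf{x}})\, \frac{e^{-i\omega r}}{r} \right] e^{-i\omega t} + \text{c.c.} \qquad (10.6.2)$$

$$e^{\rm out}_{\mu\nu}(\hat{\mathbf{x}}) \equiv \frac{1}{i\omega} \left[ e_{\mu\nu}\, \delta(1 - \hat{\mathbf{k}}\cdot\hat{\mathbf{x}}) + i\omega f_{\mu\nu}(\hat{\mathbf{x}}) \right] \qquad (10.6.3)$$

$$e^{\rm in}_{\mu\nu}(\hat{\mathbf{x}}) \equiv -\frac{1}{i\omega}\, e_{\mu\nu}\, \delta(1 + \hat{\mathbf{k}}\cdot\hat{\mathbf{x}}) \qquad (10.6.4)$$
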
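The total power carried out of a large sphere of radius $r$ by the outgoing wave
part of (10.6.2) may be calculated, following the same reasoning as in Section 10.4,
as

$$P_{\rm out} = \int d\Omega\, \langle t^{0i}_{\rm out} \rangle\, \hat x_i\, r^2 \qquad (10.6.5)$$

where $\langle t^{0i}_{\rm out} \rangle$ is the mean energy flux, obtained by using $e^{\rm out}_{\mu\nu}/r$ in place of $e_{\mu\nu}$ in Eq.
(10.3.5), and averaging over space-time dimensions large compared with $1/\omega$ and
small compared with $r$. Inspection of Eq. (10.6.3) shows that $P_{\rm out}$ will consist of
three terms,

$$P_{\rm out} = P_{\rm scat} + P_{\rm int} + P_{\rm plane} \qquad (10.6.6)$$

which arise, respectively, from $f_{\mu\nu}$ alone, from the interference between $f_{\mu\nu}$ and
$e_{\mu\nu}$, and from $e_{\mu\nu}$ alone. The first term, which represents the total power scattered
away from the incident direction, is calculated by using (10.3.5) in (10.6.5), with
$f_{\mu\nu}/r$ in place of $e_{\mu\nu}$:

$$P_{\rm scat} = \frac{\omega^2}{16\pi G} \int d\Omega \left[ f^{\lambda\nu *}(\hat{\mathbf{x}})\, f_{\lambda\nu}(\hat{\mathbf{x}}) - \tfrac12 |f^{\lambda}{}_{\lambda}(\hat{\mathbf{x}})|^2 \right] \qquad (10.6.7)$$

The interference term is similarly calculated as

$$P_{\rm int} = \frac{\omega^2}{8\pi G}\, {\rm Re} \left\{ \frac{i}{\omega} \int d\Omega\, \delta(1 - \hat{\mathbf{k}}\cdot\hat{\mathbf{x}}) \left[ e^{\lambda\nu *} f_{\lambda\nu}(\hat{\mathbf{x}}) - \tfrac12 e^{\lambda *}{}_{\lambda}\, f^{\nu}{}_{\nu}(\hat{\mathbf{x}}) \right] \right\}$$

or, integrating over the delta function,

$$P_{\rm int} = -\frac{\omega}{4G}\, {\rm Im} \left\{ e^{\lambda\nu *} f_{\lambda\nu}(\hat{\mathbf{k}}) - \tfrac12 e^{\lambda *}{}_{\lambda}\, f^{\nu}{}_{\nu}(\hat{\mathbf{k}}) \right\} \qquad (10.6.8)$$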
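The step of integrating over the delta function can be spelled out as follows. This is a sketch that assumes the normalization $\int_{-1}^{1} \delta(1 - \mu)\, d\mu = 1$, which is what the Legendre sum implies (only the $l = 0$ term survives integration over $\mu$); the shorthand $X(\hat{\mathbf{x}})$ is introduced here purely for illustration.

```latex
X(\hat{\mathbf{x}}) \equiv e^{\lambda\nu *} f_{\lambda\nu}(\hat{\mathbf{x}})
    - \tfrac12\, e^{\lambda *}{}_{\lambda}\, f^{\nu}{}_{\nu}(\hat{\mathbf{x}})

\int d\Omega\; \delta(1 - \hat{\mathbf{k}}\cdot\hat{\mathbf{x}})\, X(\hat{\mathbf{x}})
    = \int_0^{2\pi} d\varphi \int_{-1}^{1} d\mu\; \delta(1 - \mu)\, X
    = 2\pi\, X(\hat{\mathbf{k}})

P_{\rm int}
    = \frac{\omega^2}{8\pi G}\, {\rm Re}\left\{ \frac{i}{\omega}\, 2\pi\, X(\hat{\mathbf{k}}) \right\}
    = \frac{\omega}{4G}\, {\rm Re}\left\{ i\, X(\hat{\mathbf{k}}) \right\}
    = -\frac{\omega}{4G}\, {\rm Im}\, X(\hat{\mathbf{k}})
```

The last equality uses ${\rm Re}(iX) = -{\rm Im}\, X$, which is where the imaginary part of the forward amplitude enters.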

The last term in (10.6.6), which represents the power carried out of the sphere
by the plane wave, is formally infinite for $r \to \infty$. However, a plane wave carries
as much power into any volume as it carries out of it, so the power brought into
a sphere of radius $r$ by the ingoing wave (10.6.4) is also equal to this term

$$P_{\rm in} = P_{\rm plane} \qquad (10.6.9)$$

Thus $P_{\rm plane}$ cancels out of the equation of energy conservation, which gives the
power absorbed by the target as

$$P_{\rm abs} = P_{\rm in} - P_{\rm out} = -P_{\rm scat} - P_{\rm int} \qquad (10.6.10)$$

The energy flux in the incident wave is given by Eq. (10.3.5) as

$$\Phi = \frac{\omega^2}{16\pi G} \left( e^{\lambda\nu *} e_{\lambda\nu} - \tfrac12 |e^{\lambda}{}_{\lambda}|^2 \right) \qquad (10.6.11)$$

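Thus the effective cross-section for elastic scattering of the gravitational wave is

$$\sigma_{\rm scat} \equiv \frac{P_{\rm scat}}{\Phi} = \frac{\int d\Omega \left[ f^{\lambda\nu *} f_{\lambda\nu} - \tfrac12 |f^{\lambda}{}_{\lambda}|^2 \right]}{\left[ e^{\lambda\nu *} e_{\lambda\nu} - \tfrac12 |e^{\lambda}{}_{\lambda}|^2 \right]} \qquad (10.6.12)$$

This must be distinguished from the total cross-section for scattering or absorption
of the wave:

$$\sigma_{\rm tot} \equiv \frac{P_{\rm scat} + P_{\rm abs}}{\Phi} \qquad (10.6.13)$$

According to (10.6.10), the total cross-section can be expressed in terms of the
interference between the incident and scattered waves,

$$\sigma_{\rm tot} = -\frac{P_{\rm int}}{\Phi} \qquad (10.6.14)$$

or, using (10.6.8) and (10.6.11),

$$\sigma_{\rm tot} = \frac{4\pi\, {\rm Im} \left\{ e^{\lambda\nu *} f_{\lambda\nu}(\hat{\mathbf{k}}) - \tfrac12 e^{\lambda *}{}_{\lambda}\, f^{\nu}{}_{\nu}(\hat{\mathbf{k}}) \right\}}{\omega \left( e^{\lambda\nu *} e_{\lambda\nu} - \tfrac12 |e^{\lambda}{}_{\lambda}|^2 \right)} \qquad (10.6.15)$$

This result, that the total cross-section is $4\pi/\omega$ times the imaginary part of a
forward scattering amplitude, was first derived in classical electrodynamics,$^{10}$
and is therefore known as the optical theorem. Here and in electrodynamics it is a
consequence of the conservation of energy, whereas in quantum mechanics there is
a similar theorem based on the conservation of probability.$^{11}$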

Since the incident wave is weak, the scattering amplitude $f_{\lambda\nu}$ is a linear
combination of the components of the incident polarization tensor $e_{\rho\sigma}$. It follows
that the cross-sections (10.6.12) and (10.6.15) are independent of the normalization
of $e_{\mu\nu}$, though they may depend on $\mathbf{k}$ and on the form of the polarization tensor.
The aim of gravitational scattering theory is to calculate $f_{\lambda\nu}$; following this, the
various cross-sections can be determined from (10.6.12) and (10.6.15).


7 Detection of Gravitational Radiation 

Experiments that aim at the detection of gravitational radiation have been
carried out by Weber over the last decade,$^{1}$ and are presently being planned in
laboratories throughout the world. Most of these experiments make use of resonant
quadrupole antennas, which can be any “small” mechanical or hydrodynamical
system with a natural mode of free oscillation. It happens that the effective
cross-sections of these antennas can be evaluated by use of the optical theorem derived
in the last section, without any need for a detailed analysis of the interaction
between the gravitational wave and the antenna.

Our first assumption is that the antenna is much smaller than the wavelength
$2\pi/\omega$, so that the scattered gravitational wave is pure quadrupole radiation. By
the same reasoning that earlier led to Eqs. (10.4.10), (10.4.12), (10.5.3), and (10.5.4),
we may now conclude that the scattering amplitude in Eq. (10.6.1) takes the form

$$f_{\mu\nu}(\hat{\mathbf{x}}) = t_{\mu\nu}(\hat{\mathbf{x}}) - \tfrac12 \eta_{\mu\nu}\, t^{\lambda}{}_{\lambda}(\hat{\mathbf{x}}) \qquad (10.7.1)$$

where $t_{\mu\nu}$ is proportional to the Fourier transform of the perturbation in $T_{\mu\nu}$
caused by the wave; the conservation of energy and momentum give, as before,

$$t_{i0}(\hat{\mathbf{x}}) = -\hat x_j\, t_{ij} \qquad t_{00}(\hat{\mathbf{x}}) = \hat x_i \hat x_j\, t_{ij} \qquad (10.7.2)$$

where $t_{ij}$ is independent of $\hat{\mathbf{x}}$, though depending of course on $\omega$, $e_{\mu\nu}$, and on the
detailed interaction between the incident wave and the antenna. Adopting a
coordinate system in which the incident propagation vector $\mathbf{k}$ is in the 3-direction,
and a gauge in which the only nonvanishing elements of the polarization tensor are
$e_{11} = -e_{22}$ and $e_{12} = e_{21}$, the total cross-section (10.6.15) is now given by

$$\sigma_{\rm tot} = \frac{2\pi\, {\rm Im} \left\{ e_{11}^* (t_{11} - t_{22}) + 2 e_{12}^*\, t_{12} \right\}}{\omega \left[ |e_{11}|^2 + |e_{12}|^2 \right]} \qquad (10.7.3)$$

Also, the angular integral in (10.6.12) can now be calculated by the same method
as in Section 10.5, and we find for the elastic scattering cross-section the value

$$\sigma_{\rm scat} = \frac{4\pi \left[ t_{ij}^*\, t_{ij} - \tfrac13 |t_{ii}|^2 \right]}{5 \left[ |e_{11}|^2 + |e_{12}|^2 \right]} \qquad (10.7.4)$$
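The angular integral behind Eq. (10.7.4) can be verified numerically for an arbitrary complex symmetric matrix $t_{ij}$. The sketch below assumes the metric signature with $\eta_{00} = -1$, uses the identity $f^{\lambda\nu *} f_{\lambda\nu} - \tfrac12 |f^{\lambda}{}_{\lambda}|^2 = t^{\lambda\nu *} t_{\lambda\nu} - \tfrac12 |t^{\lambda}{}_{\lambda}|^2$ that follows from (10.7.1), and integrates with a three-point Gauss-Legendre rule in $\cos\theta$ and a uniform rule in $\varphi$ (exact here because the integrand is a polynomial of degree four in $\hat{\mathbf{x}}$). All names and the random test matrix are illustrative assumptions.

```python
import math
import random

def integrand(t, xhat):
    """t^{lambda nu *} t_{lambda nu} - |t^lambda_lambda|^2 / 2, using Eq. (10.7.2)."""
    t_i0 = [-sum(xhat[j] * t[i][j] for j in range(3)) for i in range(3)]
    t_00 = sum(xhat[i] * xhat[j] * t[i][j] for i in range(3) for j in range(3))
    spatial = sum(abs(t[i][j]) ** 2 for i in range(3) for j in range(3))
    mixed = sum(abs(a) ** 2 for a in t_i0)
    trace4 = sum(t[i][i] for i in range(3)) - t_00   # eta^{00} = -1
    return spatial - 2.0 * mixed + abs(t_00) ** 2 - 0.5 * abs(trace4) ** 2

def sphere_integral(t, nphi=8):
    """Integrate the integrand over all directions xhat."""
    nodes = [(-math.sqrt(0.6), 5.0 / 9.0), (0.0, 8.0 / 9.0), (math.sqrt(0.6), 5.0 / 9.0)]
    total = 0.0
    for mu, weight in nodes:
        s = math.sqrt(1.0 - mu * mu)
        for m in range(nphi):
            phi = 2.0 * math.pi * m / nphi
            xhat = (s * math.cos(phi), s * math.sin(phi), mu)
            total += weight * (2.0 * math.pi / nphi) * integrand(t, xhat)
    return total

def closed_form(t):
    """(8 pi / 5) [t*_ij t_ij - |t_ii|^2 / 3]."""
    spatial = sum(abs(t[i][j]) ** 2 for i in range(3) for j in range(3))
    trace = sum(t[i][i] for i in range(3))
    return 8.0 * math.pi / 5.0 * (spatial - abs(trace) ** 2 / 3.0)

# Random complex symmetric test matrix (hypothetical values)
random.seed(0)
t = [[0j] * 3 for _ in range(3)]
for i in range(3):
    for j in range(i, 3):
        t[i][j] = t[j][i] = complex(random.uniform(-1, 1), random.uniform(-1, 1))

print(sphere_integral(t), closed_form(t))
```

Dividing the closed form by the flux factor $e^{\lambda\nu *} e_{\lambda\nu} - \tfrac12 |e^{\lambda}{}_{\lambda}|^2 = 2(|e_{11}|^2 + |e_{12}|^2)$ gives the coefficient $4\pi/5$ in Eq. (10.7.4).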


Our other assumption is that the scattering is resonant, that is, that the
frequency $\omega$ of the incident wave is close to a natural frequency $\omega_0$ of free
oscillation of the antenna system. We can think of the incident wave as merely serving
to excite this free oscillation, which then loses energy through reradiation of
gravitational waves or into other channels, corresponding to elastic scattering or to
absorption of the incident wave, respectively.

One consequence of this assumption is that the ratio of the elastic scattering
cross-section to the total cross-section is simply equal to the fraction $\eta$ of the
energy of the free oscillation that is dissipated as gravitational radiation rather
than heat, light, and so on,

$$\frac{\sigma_{\rm scat}}{\sigma_{\rm tot}} = \eta \qquad (10.7.5)$$

where

$$\eta \equiv \frac{\Gamma_{\rm grav}}{\Gamma}$$

with $\Gamma$ the total decay rate of the free oscillation and $\Gamma_{\rm grav}$ the decay rate owing to
the emission of gravitational radiation. Since $\eta$ is a parameter characterizing the
free oscillation of the antenna, and has nothing to do with how the oscillation is
excited, it is independent of $e_{\mu\nu}$.

Another consequence of the assumption of resonant scattering is that the
form of the matrix $t_{ij}$ is given by some fixed matrix $n_{ij}$ depending only on the
geometric properties of the oscillation being excited. That is, $t_{ij}$ must equal $n_{ij}$
times some function of the polarization components $e_{11}$ and $e_{12}$. The incident field
is assumed here to be weak, so this latter function must be linear, and therefore

$$t_{ij} = n_{ij} (\alpha e_{11} + \beta e_{12}) \qquad (10.7.6)$$

with $n_{ij}$, $\alpha$, and $\beta$ all independent of $e_{11}$ and $e_{12}$. For instance, if the antenna has an
axis of symmetry along some direction $\mathbf{n}$, then $n_{ij}$ is a linear combination of $n_i n_j$
and $\delta_{ij}$, and the term proportional to $\delta_{ij}$ does not contribute to (10.7.3) or (10.7.4),
so in this case we could take

$$n_{ij} = n_i n_j \qquad (10.7.7)$$

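The two requirements, (10.7.5) and (10.7.6), impose stringent conditions on
the scattering amplitude. Using (10.7.3), (10.7.4), and (10.7.6) in (10.7.5), we find

$$\frac{\eta}{\omega}\, {\rm Im} \left\{ \left[ e_{11}^* (n_{11} - n_{22}) + 2 e_{12}^* n_{12} \right] \left[ \alpha e_{11} + \beta e_{12} \right] \right\} = \tfrac25\, |\alpha e_{11} + \beta e_{12}|^2 \left[ n_{ij}^* n_{ij} - \tfrac13 |n_{ii}|^2 \right]$$

This must hold for all $e_{\mu\nu}$, so by equating the coefficients of $|e_{11}|^2$, $e_{11}^* e_{12}$, and
$|e_{12}|^2$, we obtain the conditions

$$\frac{1}{|\alpha|^2}\, {\rm Im} \left\{ (n_{11} - n_{22})\, \alpha \right\} = \frac{1}{2i\alpha^*\beta} \left\{ (n_{11} - n_{22})\, \beta - 2 n_{12}^* \alpha^* \right\} = \frac{1}{|\beta|^2}\, {\rm Im} \left\{ 2 n_{12}\, \beta \right\} = \frac{2\omega}{5\eta} \left[ n_{ij}^* n_{ij} - \tfrac13 |n_{ii}|^2 \right]$$

The solution of these equations takes the form

$$\alpha = \frac{5 \eta g\, (n_{11}^* - n_{22}^*)}{2\omega \left[ n_{ij}^* n_{ij} - \tfrac13 |n_{ii}|^2 \right]} \qquad \beta = \frac{5 \eta g\, n_{12}^*}{\omega \left[ n_{ij}^* n_{ij} - \tfrac13 |n_{ii}|^2 \right]}$$

where $g$ is a complex number with

$${\rm Im}\, g = |g|^2 \qquad (10.7.8)$$

The scattering amplitude (10.7.6) is now

$$t_{ij} = \frac{5 g \eta\, n_{ij} \left[ (n_{11}^* - n_{22}^*)\, e_{11} + 2 n_{12}^*\, e_{12} \right]}{2\omega \left[ n_{kl}^* n_{kl} - \tfrac13 |n_{kk}|^2 \right]} \qquad (10.7.9)$$

Note that this depends only on the form of the matrix $n_{ij}$, not on its normalization.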

The final consequence of the assumption of resonant scattering is that the
frequency dependence of the scattering amplitude is given by the Fourier
transform of a function with the time-dependence

$$e^{-i\omega_0 t}\, e^{-\Gamma t/2}$$

which oscillates at frequency $\omega_0$, and decays in amplitude and energy at the rates
$\Gamma/2$ and $\Gamma$, respectively. That is, $t_{ij}$ must have the frequency dependence

$$t_{ij} \propto \frac{1}{\omega - \omega_0 + i\Gamma/2} \qquad (10.7.10)$$

Since $\eta$ and $n_{ij}$ depend on the properties of the free oscillation, and not on how it is
produced, this frequency dependence can only arise from the factor $g$. In order to
satisfy the “unitarity” condition (10.7.8) for all $\omega$, we must then have

$$g = \frac{-\Gamma/2}{\omega - \omega_0 + i\Gamma/2} \qquad (10.7.11)$$


The total cross-section (10.7.3) for absorption or scattering of gravitational waves
by the antenna is then, in c.g.s. units,

$$\sigma_{\rm tot} = \left( \frac{5\pi\eta c^2}{\omega^2} \right) \left( \frac{\Gamma^2/4}{(\omega - \omega_0)^2 + \Gamma^2/4} \right) \frac{\left| (n_{11}^* - n_{22}^*)\, e_{11} + 2 n_{12}^*\, e_{12} \right|^2}{\left[ n_{ij}^* n_{ij} - \tfrac13 |n_{ii}|^2 \right] \left[ |e_{11}|^2 + |e_{12}|^2 \right]} \qquad (10.7.12)$$

It is truly remarkable that this cross-section is entirely determined as a function of
$e_{\mu\nu}$ and $\omega$ by the parameters $\omega_0$, $\Gamma$, and $\eta$, and by the form of the matrix $n_{ij}$,
whether the resonant oscillation is mechanical, acoustical, electrical, or anything
else.

In the special case of an antenna with an axis of circular symmetry $\mathbf{n}$, the
matrix $n_{ij}$ has the simple form (10.7.7). Taking the symmetry axis to lie in the
1-3 plane at an angle $\theta$ to the incident 3-direction, the nonvanishing elements
of this matrix are

$$n_{11} = \sin^2\theta \qquad n_{13} = \cos\theta \sin\theta \qquad n_{33} = \cos^2\theta$$

The total cross-section (10.7.12) is then

$$\sigma_{\rm tot} = \left( \frac{15\pi\eta c^2}{2\omega^2} \right) \left( \frac{\Gamma^2/4}{(\omega - \omega_0)^2 + \Gamma^2/4} \right) \left( \frac{|e_{11}|^2}{|e_{11}|^2 + |e_{12}|^2} \right) \sin^4\theta \qquad (10.7.13)$$

The factor $\sin^4\theta$ makes the cross-section greatest when the antenna axis is
oriented at right angles to the direction of wave propagation, that is, for $\theta = \pi/2$.
This is just a reflection of the fact that gravitational waves, like electromagnetic
waves, are transverse.
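The reduction of the general cross-section (10.7.12) to the axially symmetric form (10.7.13) can be checked by direct evaluation. In this sketch the function names, the sample polarization components, and the sample antenna parameters are hypothetical; only the two formulas themselves come from the text.

```python
import math

C = 2.998e10   # cm/s (assumed standard value)

def lorentzian(omega, omega0, Gamma):
    return (Gamma ** 2 / 4.0) / ((omega - omega0) ** 2 + Gamma ** 2 / 4.0)

def sigma_tot_general(n, e11, e12, eta, omega, omega0, Gamma):
    """Eq. (10.7.12) for an arbitrary 3x3 matrix n_ij."""
    n = [[complex(x) for x in row] for row in n]
    norm = (sum(abs(n[i][j]) ** 2 for i in range(3) for j in range(3))
            - abs(sum(n[i][i] for i in range(3))) ** 2 / 3.0)
    overlap = abs((n[0][0] - n[1][1]).conjugate() * e11
                  + 2.0 * n[0][1].conjugate() * e12) ** 2
    polarization = abs(e11) ** 2 + abs(e12) ** 2
    prefactor = 5.0 * math.pi * eta * C ** 2 / omega ** 2
    return prefactor * lorentzian(omega, omega0, Gamma) * overlap / (norm * polarization)

def sigma_tot_axial(theta, e11, e12, eta, omega, omega0, Gamma):
    """Eq. (10.7.13) for an antenna whose symmetry axis makes angle theta with k."""
    prefactor = 15.0 * math.pi * eta * C ** 2 / (2.0 * omega ** 2)
    polarization = abs(e11) ** 2 / (abs(e11) ** 2 + abs(e12) ** 2)
    return prefactor * lorentzian(omega, omega0, Gamma) * polarization * math.sin(theta) ** 4

theta = 0.7
axis = (math.sin(theta), 0.0, math.cos(theta))          # symmetry axis in the 1-3 plane
n_axial = [[axis[i] * axis[j] for j in range(3)] for i in range(3)]

params = dict(e11=1.0 + 0.5j, e12=0.3 - 0.2j, eta=1e-30,
              omega=1.0e4, omega0=1.0e4 + 0.05, Gamma=0.1)

general = sigma_tot_general(n_axial, **params)
axial = sigma_tot_axial(theta, **params)
print(general, axial)
```

For $n_{ij} = n_i n_j$ the normalization factor $n_{ij}^* n_{ij} - \tfrac13 |n_{ii}|^2$ equals $2/3$, which is what turns the prefactor $5\pi$ into $15\pi/2$.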

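When the polarization of the gravitational wave is not measured, the quantity
of interest is the average of (10.7.12) over the helicities $\pm 2$, that is, over
polarization tensors with $e_{11} = \mp i e_{12}$:

$$\bar\sigma_{\rm tot} = \left( \frac{5\pi\eta c^2}{2\omega^2} \right) \left( \frac{\Gamma^2/4}{(\omega - \omega_0)^2 + \Gamma^2/4} \right) \left( \frac{|n_{11} - n_{22}|^2 + 4 |n_{12}|^2}{n_{ij}^* n_{ij} - \tfrac13 |n_{ii}|^2} \right) \qquad (10.7.14)$$

For an antenna with an axis of circular symmetry, the effect of averaging over
helicities is just to replace the polarization factor $|e_{11}|^2 / (|e_{11}|^2 + |e_{12}|^2)$ in Eq. (10.7.13) with $\tfrac12$.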

The above analysis is strictly applicable only where there is a single nondegenerate
resonant oscillation. When there are several degenerate modes, the
particular linear combination excited by a gravitational wave can depend on the
polarization of the wave, so that $t_{ij}$ need not be proportional to a fixed matrix
$n_{ij}$. For instance, if the antenna is an elastic sphere, then any quadrupole oscillation
will consist of five independent modes. In this case, $t_{ij}$ must be a linear combination
of $\delta_{ij}$ and $e_{ij}$, but again the term proportional to $\delta_{ij}$ does not contribute
to (10.7.3) or (10.7.4), so we can take

$$t_{ij} = \gamma\, e_{ij}$$

Equations (10.7.3)-(10.7.5) now give

$${\rm Im}\, \gamma = \frac{2\omega}{5\eta}\, |\gamma|^2$$

so, since $\gamma$ must have the frequency dependence (10.7.10), it is given by

$$\gamma = \left( \frac{5\eta}{2\omega} \right) \left( \frac{-\Gamma/2}{\omega - \omega_0 + i\Gamma/2} \right)$$

The total cross-section (10.7.3) is now

$$\sigma_{\rm tot} = \left( \frac{10\pi\eta c^2}{\omega^2} \right) \left( \frac{\Gamma^2/4}{(\omega - \omega_0)^2 + \Gamma^2/4} \right) \qquad (10.7.15)$$

for any incident polarization.
In all cases the effective cross-section has a maximum when the antenna is
tuned so that the resonant frequency $\omega_0$ is equal to the frequency $\omega$ of the incident
wave. Inspection of (10.7.12)-(10.7.15) shows that this maximum cross-section
is of the order

$$\sigma_{\max} \approx \eta \lambda^2 \qquad (10.7.16)$$

where $\lambda$ is the wavelength $2\pi c/\omega$. In the ideal case of an oscillation that decays
purely through the emission of gravitational radiation, we would have $\eta = 1$,
and $\sigma_{\max}$ would have the very large value $\lambda^2$. Of course, this ideal case is never
even approached in practice; for instance, we found in Section 10.5 that Weber’s
large aluminum cylinders have $\eta \simeq 3 \times 10^{-34}$! Generally the rate $\Gamma_{\rm grav}$ at which
a resonant oscillator emits gravitational radiation depends on the gross dimensions
of the antenna, and is difficult to increase; thus, in order to make $\sigma_{\max}$ as large as
possible, it is necessary to reduce the total loss rate $\Gamma$ in the ratio $\eta = \Gamma_{\rm grav}/\Gamma$
as much as possible, possibly by employing some sort of oscillation in a superfluid.

However, tuning our antenna does no good unless we have some strong source
of gravitational radiation with a known frequency to which we can tune. Perhaps
the most promising source$^{12}$ is the pulsar NP 0532 in the Crab nebula. This object
is observed to emit pulses of electromagnetic radiation at optical, X-ray, and
radio frequencies with a period $2\pi/\Omega = 0.03309$ sec. As discussed in Section 10.5,
the pulsars are believed to be rotating neutron stars$^{3}$ with moments of inertia of
order $10^{45}$ g cm$^2$ and unknown equatorial ellipticities $e$. The Crab pulsar is
therefore presumably emitting gravitational radiation, with $\omega \simeq 2\Omega = 379.8$ Hz, at a
rate of about $10^{45} e^2$ ergs/sec. Since the Crab nebula is at a distance of 6500 light
years, or $6.2 \times 10^{21}$ cm, the flux of gravitational radiation passing the earth
should be about $\Phi \simeq e^2$ ergs/sec-cm$^2$. A resonant linear quadrupole antenna that
is “aimed” at and tuned to the Crab pulsar will have a cross-section $\sigma_{\rm tot}$ given by
(10.7.13) as $7.4 \times 10^{16}\, \eta$ cm$^2$. The power absorbed by the antenna will thus be of
order $10^{16} e^2 \eta$ ergs/sec. For instance, if $\eta$ is $10^{-32}$ and $e$ is of order $10^{-4}$, this
power is of order $10^{-24}$ ergs/sec, which might perhaps be detectable. Unfortunately,
in order to use an aluminum cylinder of the sort discussed in Section 10.5 as an
antenna tuned to the Crab, the cylinder would have to have the rather ungainly
length $\pi v_s/\omega$ of 42 meters. To get around this difficulty, one can use as antennas
hoops, forks, and so on, which have lower natural frequencies for a given size than
bars or cylinders. A group at Rochester$^{13}$ is planning a hoop antenna that could
be tuned to the Crab.
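The chain of estimates for the Crab pulsar can be retraced numerically. This sketch takes the moment of inertia as exactly $10^{45}$ g cm$^2$ and assumes that the quoted cross-section is the helicity-averaged form of (10.7.13) at resonance with $\theta = \pi/2$ (polarization factor $\tfrac12$); that reading, the constants, and the variable names are assumptions made for illustration.

```python
import math

G = 6.674e-8   # cm^3 g^-1 s^-2 (assumed standard value)
C = 2.998e10   # cm/s (assumed standard value)

period = 0.03309                      # sec, pulse period from the text
Omega = 2.0 * math.pi / period        # rotation angular frequency, 1/sec
omega = 2.0 * Omega                   # gravitational-wave angular frequency, 1/sec

I = 1.0e45                            # g cm^2, order of magnitude from the text
power_per_e2 = 32.0 * G * Omega ** 6 * I ** 2 / (5.0 * C ** 5)   # Eq. (10.5.22) / e^2

distance = 6.2e21                     # cm
flux_per_e2 = power_per_e2 / (4.0 * math.pi * distance ** 2)     # erg / (sec cm^2) / e^2

# Helicity-averaged Eq. (10.7.13) at resonance, theta = pi/2, divided by eta
sigma_per_eta = 15.0 * math.pi * C ** 2 / (4.0 * omega ** 2)

# Length of a free bar whose fundamental mode is at omega: L = pi v_s / omega
bar_length_m = math.pi * 5.1e5 / omega / 100.0

print(f"omega         = {omega:.1f} 1/sec")
print(f"P / e^2       = {power_per_e2:.1e} erg/sec")
print(f"flux / e^2    = {flux_per_e2:.1f} erg/sec/cm^2")
print(f"sigma / eta   = {sigma_per_eta:.2e} cm^2")
print(f"bar length    = {bar_length_m:.0f} m")
```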

All of the experiments carried out so far by Weber have made use of resonant
quadrupole antennas that are not tuned to any particular source. Since it is too
much to ask that a monochromatic source like a pulsar should just happen to fall
within the bandwidth of the antenna, these experiments really aim at the detection
of broad-band gravitational radiation, with an energy flux $\Phi(\omega)\, d\omega$ between
frequencies $\omega$ and $\omega + d\omega$. If exposed to such radiation, the power absorbed by a
resonant antenna will be

$$P = \sigma_{\max} \int \left[ \frac{\Gamma^2/4}{(\omega - \omega_0)^2 + \Gamma^2/4} \right] \Phi(\omega)\, d\omega$$

where $\sigma_{\max}$ is the effective cross-section of the antenna at resonance, given by
setting $\omega = \omega_0$ in (10.7.12), (10.7.13), (10.7.14), or (10.7.15). If $\Phi(\omega)$ is roughly
constant over the frequency range $\omega_0 - \Gamma$ to $\omega_0 + \Gamma$, then it can be taken out of
the integral, and we have

$$P = \frac{\pi}{2}\, \sigma_{\max}\, \Gamma\, \Phi(\omega_0) \qquad (10.7.17)$$
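The factor $\pi/2$ comes from the area under the Lorentzian line shape. As a worked step, assuming the resonance is narrow enough ($\Gamma \ll \omega_0$) that the limits of integration may be extended to $\pm\infty$, and using the substitution $u = \omega - \omega_0$ (a variable introduced here for illustration):

```latex
\int_{-\infty}^{\infty} \frac{\Gamma^2/4}{(\omega - \omega_0)^2 + \Gamma^2/4}\, d\omega
    = \frac{\Gamma^2}{4} \int_{-\infty}^{\infty} \frac{du}{u^2 + (\Gamma/2)^2}
    = \frac{\Gamma^2}{4} \cdot \frac{\pi}{\Gamma/2}
    = \frac{\pi \Gamma}{2}
```

Multiplying by $\sigma_{\max}\, \Phi(\omega_0)$ then gives the absorbed power for a flux that is flat across the resonance.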


For a source that radiates for a time much longer than the antenna relaxation
time $1/\Gamma$, a quasi-steady state will be reached in which the mean energy $E$ in the
resonant mode is such that the loss rate $E\Gamma$ just balances the absorbed power $P$:

$$E = \frac{\pi}{2}\, \sigma_{\max}\, \Phi(\omega_0) \qquad (10.7.18)$$

In this case, a measurement of the mean excitation energy of the resonant mode
serves to measure, or at least to set an upper limit on, the power flux at the
resonant frequency. For instance, the earth has a fundamental spheroidal oscillation
mode$^{14}$ ${}_0S_2$, with a period $2\pi/\omega$ of 54 min and a decay rate $\Gamma$ of order $5 \times 10^{-6}$
sec$^{-1}$, in which the mass density perturbation is of the form $\rho_1(r)\, Y_2^m(\theta, \varphi)$. The
gravitational decay rate $\Gamma_{\rm grav}$ of this mode will be roughly of order $G M_\oplus R_\oplus^2 \omega^4 / c^5$
[compare Eq. (10.5.18)], or about $10^{-25}$ sec$^{-1}$, so the branching ratio $\eta$ is of order
$10^{-20}$. The cross-section (10.7.15) at resonance is here $7.5 \times 10^{27}\, \eta$ cm$^2$, or
roughly $10^{7}$ to $10^{8}$ cm$^2$. From seismic measurements of the mean strain in the
earth’s crust during quiet periods, Forward et al.$^{15}$ in 1961 set an upper limit
on $\Phi(\omega_0)$ of roughly 20 watts/cm$^2$-Hz. It is hoped that a much better upper limit
on $\Phi$ can be set by placing a gravimeter on the moon,$^{16}$ which is very much quieter
seismically than the earth.
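The order-of-magnitude figures for the earth's quadrupole mode can be retraced as follows. The earth's mass and radius are standard values supplied here as assumptions (they are not quoted in the text), and the variable names are illustrative.

```python
import math

G = 6.674e-8        # cm^3 g^-1 s^-2 (assumed standard value)
C = 2.998e10        # cm/s (assumed standard value)
M_EARTH = 5.97e27   # g (assumed standard value)
R_EARTH = 6.37e8    # cm (assumed standard value)

omega = 2.0 * math.pi / (54.0 * 60.0)   # mode angular frequency for a 54 min period
Gamma = 5.0e-6                          # total decay rate from the text, 1/sec

# Rough estimate G M R^2 omega^4 / c^5, by analogy with Eq. (10.5.18)
gamma_grav = G * M_EARTH * R_EARTH ** 2 * omega ** 4 / C ** 5
eta = gamma_grav / Gamma

# Eq. (10.7.15) at resonance, divided by eta
sigma_per_eta = 10.0 * math.pi * C ** 2 / omega ** 2
sigma = sigma_per_eta * eta

print(f"Gamma_grav  ~ {gamma_grav:.1e} 1/sec")
print(f"eta         ~ {eta:.1e}")
print(f"sigma / eta = {sigma_per_eta:.2e} cm^2")
print(f"sigma       ~ {sigma:.1e} cm^2")
```

Since $\Gamma_{\rm grav}$ is only an order-of-magnitude estimate, the resulting cross-section should be read as accurate to about a factor of ten.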

For a “burst” source that radiates for a time $\tau$ less than the antenna relaxation
time $1/\Gamma$, the total energy picked up by the antenna will be

$$\Delta E = P\tau = \frac{\pi}{2}\, \sigma_{\max}\, \Phi(\omega_0)\, \Gamma \tau$$

Thus the energy per unit area in the burst reaching the antenna within the beam
width $\Gamma$ may be determined as

$$\mathcal{E} \equiv \Phi(\omega_0)\, \Gamma \tau = \frac{2\, \Delta E}{\pi\, \sigma_{\max}} \qquad (10.7.19)$$

However, if the source radiates for a time $\tau < 1/\Gamma$, its bandwidth must be greater
than $1/\tau$, so the total energy per unit area in the burst must be larger than $\mathcal{E}$ by a
factor greater than $(\Gamma\tau)^{-1}$.

The only positive indication so far of the presence of gravitational radiation
in the universe comes from the experiments of Weber,$^{1}$ which use as antennas the
aluminum cylinders described in Section 10.5. These antennas have the frequency
and “branching ratio”

$$\omega_0 / 2\pi = 1660\ \text{Hz} \qquad \eta = 3 \times 10^{-34}$$

[see Eq. (10.5.20)] so by setting $\omega \simeq \omega_0$ and averaging over helicities in Eq.
(10.7.13), we find a cross-section at resonance

$$\sigma_{\max} = 2.9 \times 10^{-20} \sin^4\theta\ \text{cm}^2$$

If the smallest energy increment $\Delta E$ that can be distinguished from thermal
fluctuations is $kT$, or $4 \times 10^{-14}$ ergs at room temperature, then according to
(10.7.19), a burst of gravitational radiation will be detectable if the energy $\mathcal{E}$ per
unit area within the beam width satisfies the condition

$$\mathcal{E} > 9 \times 10^{5}\ \text{ergs/cm}^2 \qquad \text{for } \theta = \pi/2$$

(It is actually possible to do a little better than this by careful data processing.)
The mere observation of a number of pulses in a single cylinder would leave open
the possibility that these pulses were due to nonthermal noise, such as seismic
disturbances, electric storms, or cosmic rays, so Weber looked for coincident
pulses in aluminum cylinders 1000 km apart, at College Park in Maryland and the
Argonne National Laboratory in Illinois. In 1969 Weber reported over 100
coincident pulses, occurring at a rate that indicates a mean gravitational radiation
flux (within the bandwidth $\Gamma \simeq 0.1$ Hz) of about 0.1 erg cm$^{-2}$ sec$^{-1}$.$^{17}$
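The resonant cross-section and the burst-detection threshold quoted for Weber's cylinders follow from Eqs. (10.7.13) and (10.7.19). A minimal sketch, assuming a room temperature of 293 K and standard values of $c$ and Boltzmann's constant (none of which are specified in the text):

```python
import math

C = 2.998e10     # cm/s (assumed standard value)
K_B = 1.381e-16  # erg/K (assumed standard value)

omega0 = 2.0 * math.pi * 1660.0   # resonant angular frequency, 1/sec
eta = 3.0e-34                     # branching ratio from Eq. (10.5.20)

# Helicity-averaged Eq. (10.7.13) at resonance with theta = pi/2
sigma_max = 15.0 * math.pi * eta * C ** 2 / (4.0 * omega0 ** 2)

# Smallest distinguishable energy increment, taken as kT at 293 K (assumed temperature)
delta_E = K_B * 293.0

# Eq. (10.7.19): minimum detectable energy per unit area within the antenna bandwidth
threshold = 2.0 * delta_E / (math.pi * sigma_max)

print(f"sigma_max = {sigma_max:.2e} cm^2")
print(f"kT        = {delta_E:.1e} erg")
print(f"threshold = {threshold:.1e} erg/cm^2")
```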

Shortly thereafter,$^{18}$ Weber found that the rate of coincident pulses was
correlated with sidereal time, in a manner consistent with the expected $\sin^4\theta$
antenna pattern if the gravitational radiation is coming from the center of our
galaxy. (See Figure 10.1.) The galactic center is about $2.5 \times 10^{22}$ cm from the
earth, so an observed flux of 0.1 erg cm$^{-2}$ sec$^{-1}$ would indicate an energy
production of about $8 \times 10^{44}$ ergs/sec, or $0.013\, M_\odot c^2$/year. This would not in itself be so
remarkable, but since Weber’s antennas are not tuned to any particular frequency,
an energy production of $0.01\, M_\odot c^2$/year in a bandwidth of 0.1 Hz at 1660 Hz
presumably represents a total energy production $10^{3}$ to $10^{5}$ times larger, or 10 to $10^{3}$
$M_\odot c^2$/year. At this rate, the whole mass of the galaxy would be used up in $10^{8}$
to $10^{10}$ years! If Weber is really observing gravitational radiation from the galactic
center, then either he accidentally picked the precise frequency at which most of
this radiation is emitted, or else he has discovered an incredibly powerful new
source of energy.
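The energy budget implied by Weber's flux can be retraced in a few lines. This sketch assumes isotropic emission from the galactic center and a galactic mass of $10^{11}$ solar masses; the latter figure, the solar mass, and the length of the year are standard values introduced here as assumptions.

```python
import math

C = 2.998e10        # cm/s (assumed standard value)
M_SUN = 1.989e33    # g (assumed standard value)
YEAR = 3.156e7      # sec (assumed standard value)

distance = 2.5e22   # cm, distance to the galactic center from the text
flux = 0.1          # erg / (cm^2 sec), within the 0.1 Hz antenna bandwidth

# Isotropic emission assumed
luminosity = 4.0 * math.pi * distance ** 2 * flux      # erg/sec
solar_masses_per_year = luminosity * YEAR / (M_SUN * C ** 2)

# Scale up by the factor 1e3 to 1e5 for the full spectrum, as in the text
total_low = 0.01 * 1.0e3     # M_sun c^2 per year
total_high = 0.01 * 1.0e5

GALAXY_MASS = 1.0e11         # solar masses (assumed order of magnitude)
lifetime_short = GALAXY_MASS / total_high   # years
lifetime_long = GALAXY_MASS / total_low

print(f"luminosity in band = {luminosity:.1e} erg/sec")
print(f"                   = {solar_masses_per_year:.3f} M_sun c^2 / yr")
print(f"galaxy consumed in {lifetime_short:.0e} to {lifetime_long:.0e} yr")
```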

Weber has also looked for scalar radiation, using a disk with a monopole
mode of oscillation having the same frequency, 1660 Hz, as the cylinders. The
coincidence rate is observed to be much less than for the pair of cylinders; the
apparent correlation of coincidences with sidereal time agrees with a pure tensor
theory.$^{19}$

Plans are now in train to repeat Weber’s experiments with much greater
sensitivity. One important improvement that is being planned at Stanford$^{20}$ is
to operate a cylindrical antenna at a very low temperature, in the range of
millidegrees Kelvin. If the antenna is limited by thermal noise, then lowering the
temperature by a factor $10^{-5}$ would increase the sensitivity by a factor $10^{5}$. A
group at Moscow$^{21}$ is carrying out gravitational radiation experiments with
improved instrumentation, and is designing novel kinds of gravitational wave
antennas.$^{22}$ Weber is continuing his observations, using new antennas and
instrumentation. At present, the best a theorist can do is to wait for the
experimentalists to reach some sort of consensus on whether gravitational radiation has
indeed been observed.

Figure 10.1 Evidence for gravitational radiation emanating from the center of the
galaxy.$^{18}$ The detector intensity observed by Weber is plotted here (in arbitrary units)
against sidereal time. Arrows mark the sidereal times at which the antenna is most
nearly perpendicular to the line of sight to the center of the galaxy. Numbers in circles
give the observed numbers of coincidences in each time interval.


8 Quantum Theory of Gravitation* 


At present there does not exist any complete and self-consistent quantum 
theory of gravitation, and it would be out of place in this book to describe in 
detail the attempts that have been made to construct such a theory. However, it 
will be possible and it may be useful to give the reader some taste of what a 
quantum theory of gravitation would be like. 

To start at the simplest level, we would interpret a gravitational plane wave, with wave vector $k^\mu$ and helicity $\pm 2$, as consisting of gravitons: quanta with energy-momentum vector $p^\mu = \hbar k^\mu$ and spin component in the direction of motion $\pm 2\hbar$. (Here $\hbar = 1.054 \times 10^{-27}$ erg sec.) Since $k_\mu k^\mu = 0$, the graviton is a particle of zero mass, like the photon and neutrino. According to Eq. (2.8.4), the energy-momentum tensor of an assembly of gravitons, all of which have four-momenta $p^\mu = \hbar k^\mu$, is

$$T^{\mu\nu} = \frac{\hbar k^\mu k^\nu}{\omega}\,\mathcal{N} \tag{10.8.1}$$

where $\mathcal{N}$ is the number of gravitons per unit volume. Comparing this with our result for a gravitational plane wave,

$$\langle t_{\mu\nu} \rangle = \frac{k_\mu k_\nu}{16\pi G}\left(|e_+|^2 + |e_-|^2\right) \tag{10.8.2}$$

we conclude that the number density of gravitons with helicity $\pm 2$ in a plane wave is

$$\mathcal{N}_\pm = \frac{\omega}{16\pi\hbar G}\,|e_\pm|^2 \tag{10.8.3}$$

The total number density is

$$\mathcal{N} = \mathcal{N}_+ + \mathcal{N}_- = \frac{\omega}{16\pi\hbar G}\left(e^{\lambda\nu *}e_{\lambda\nu} - \tfrac{1}{2}\,|e^\lambda{}_\lambda|^2\right) \tag{10.8.4}$$
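In the same way, we can interpret our formula (10.4.13) for the power emitted as gravitational radiation by an arbitrary system as giving the rate $d\Gamma$ of emitting gravitons of energy $\hbar\omega$ into the solid angle $d\Omega$:

$$d\Gamma = \frac{dP}{\hbar\omega} = \frac{G\omega\,d\Omega}{\hbar\pi}\left[T^{\lambda\nu *}(\mathbf{k},\omega)\,T_{\lambda\nu}(\mathbf{k},\omega) - \tfrac{1}{2}\,|T^\lambda{}_\lambda(\mathbf{k},\omega)|^2\right] \tag{10.8.5}$$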


* This section lies somewhat out of the book’s main line of development, and may be omitted in a first reading.

However, the energy-momentum tensor $T_{\lambda\nu}(\mathbf{k},\omega)$ must now be interpreted as a matrix element of an energy-momentum tensor operator between final and initial states. In particular, in the quadrupole approximation the total rate for an atom to make a transition $a \to b$ by emitting gravitational radiation is

$$\Gamma(a \to b) = \frac{2G\omega^5}{5\hbar}\left[D^*_{ij}(a \to b)\,D_{ij}(a \to b) - \tfrac{1}{3}\,|D_{ii}(a \to b)|^2\right] \tag{10.8.6}$$

where

$$D_{ij}(a \to b) \equiv m_e \int \psi_b^*(\mathbf{x})\,x_i x_j\,\psi_a(\mathbf{x})\,d^3x \tag{10.8.7}$$

with $\psi_a$, $\psi_b$ the initial and final state wave functions. For instance, the rate for decay of the $3d\,(m = 2)$ state of the hydrogen atom into the $1s$ state with emission of one graviton is

$$\Gamma(3d \to 1s) = \frac{2^{23}\,G\,m_e^3\,c}{3^7\,5^{15}\,(137)^6\,\hbar^2} = 2.5 \times 10^{-44}\ \text{sec}^{-1}$$


Needless to say, there is no chance of observing such a transition. 
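As a rough numerical check, the closed-form expression quoted above can be evaluated directly. The sketch below is only illustrative: the CGS values of $G$, $m_e$, and $c$ are assumed reference values not given in the text, the inverse fine-structure constant is taken as exactly 137 as the formula is written, and the function name is hypothetical.

```python
# Evaluate the quoted closed-form rate for 3d -> 1s graviton emission in hydrogen.
# The CGS constants below are assumed reference values, not taken from the text
# (except hbar, which is the value quoted in this section).
G = 6.674e-8        # gravitational constant, cm^3 g^-1 s^-2 (assumed)
M_E = 9.109e-28     # electron mass, g (assumed)
C = 2.998e10        # speed of light, cm/s (assumed)
HBAR = 1.054e-27    # reduced Planck constant, erg s
INV_ALPHA = 137.0   # inverse fine-structure constant, as written in the formula


def graviton_rate_3d_to_1s() -> float:
    """Return 2^23 G m_e^3 c / (3^7 5^15 137^6 hbar^2) in s^-1."""
    numerator = 2**23 * G * M_E**3 * C
    denominator = 3**7 * 5**15 * INV_ALPHA**6 * HBAR**2
    return numerator / denominator


rate = graviton_rate_3d_to_1s()
print(f"Gamma(3d -> 1s) ~ {rate:.2e} s^-1")
print(f"mean lifetime   ~ {1.0 / rate:.2e} s")
```

With these inputs the result lies between $2 \times 10^{-44}$ and $3 \times 10^{-44}$ sec$^{-1}$, in line with the value quoted above; the corresponding lifetime exceeds the age of the universe by many orders of magnitude.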

The above estimates apply to a process in which a transition occurs because a graviton is emitted, so that the graviton has a definite frequency $\omega = (E_a - E_b)/\hbar$. We can also consider a process that is going on anyway, such as a collision between particles, and ask what is the probability of a graviton being emitted during the process. Here the possible graviton frequencies form a continuum, so we use our formula (10.4.22) for the emitted energy, and divide by $\hbar\omega$. The probability of emitting a graviton in the solid angle $d\Omega$ and in a frequency range $d\omega$ is then

$$dP = \frac{G\omega^2\,d\omega\,d\Omega\,P_c}{2\pi^2\hbar\omega}\sum_{N,M}\frac{\eta_N\eta_M}{(P_N \cdot k)(P_M \cdot k)}\left[(P_N \cdot P_M)^2 - \tfrac{1}{2}\,m_N^2 m_M^2\right] \tag{10.8.8}$$

where $P_c$ is the probability of the collision occurring without graviton emission, and where once again the sums over $N$ and $M$ run over all particles in the initial ($\eta = -1$) or final ($\eta = +1$) states. This formula has also been derived by purely quantum mechanical methods. 23

It should be noted that the emission probability $dP$ is proportional to $d\omega/\omega$ (the factors $P \cdot k$ in the denominator being proportional to $\omega$), so the total probability for emission of gravitational radiation in a collision diverges logarithmically, both at $\omega \to \infty$ and $\omega \to 0$. The first, or “ultraviolet,” divergence was encountered classically, and arises just because of our approximation that the collision occurs instantaneously; it is to be eliminated by cutting off the $\omega$-integral at $\omega \sim 1/\Delta t \sim E/\hbar$, where $\Delta t$ is the duration of the collision and, via the uncertainty principle, $E$ is some typical energy characteristic of the collision. The second, or “infrared,” divergence at $\omega = 0$ is a purely quantum mechanical problem; it enters here only because we divided the emitted energy $dE$ by $\hbar\omega$ to get the emission probability. It is removed by recognizing that $P_c$, the probability for the collision to occur without gravitational radiation, is itself logarithmically divergent because of emission and reabsorption of virtual gravitons, and that the divergences cancel. 24 We see that once we have accepted the most elementary ideas about the quantum nature of gravitational radiation, we are inevitably led to the full infrastructure of real and virtual gravitons.

The quantum interpretation of gravitational radiation allows a simple 
derivation of the relations between absorption and emission of gravitons. Imagine 
a black-body cavity in a body of temperature T that is so large and dense that it is 
opaque to gravitational radiation. The cavity will be filled with both electro- 
magnetic and gravitational radiation in equilibrium with the container. By using 
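the same statistical arguments that give the Planck distribution law for electromagnetic radiation, 25 we may conclude that the number of gravitons per unit volume with frequencies between $\omega$ and $\omega + d\omega$ is

$$n(\omega)\,d\omega = \frac{\omega^2\,d\omega}{\pi^2}\left[\exp\left(\frac{\hbar\omega}{kT}\right) - 1\right]^{-1} \tag{10.8.9}$$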


where $k = 1.38 \times 10^{-16}$ erg/°K is Boltzmann’s constant. (It is crucial in the derivation of this result that gravitons, like photons, have two independent polarization states.) In order for equilibrium to be maintained, it is necessary that the absorption rate $A(\omega)$ of a single graviton in the container wall be related to the rate per unit volume $E(\omega)\,d\omega$ of graviton emission between frequencies $\omega$ and $\omega + d\omega$ by

$$A(\omega)\,n(\omega)\,d\omega = E(\omega)\,d\omega \tag{10.8.10}$$

This can also be written 26

$$E(\omega) = I(\omega) + S(\omega) \tag{10.8.11}$$

where

$$S(\omega) \equiv \frac{\omega^2}{\pi^2}\exp\left(-\frac{\hbar\omega}{kT}\right)A(\omega) \tag{10.8.12}$$

$$I(\omega) \equiv n(\omega)\exp\left(-\frac{\hbar\omega}{kT}\right)A(\omega) \tag{10.8.13}$$


We interpret $S(\omega)$ as the rate per unit volume and per unit frequency interval of spontaneous emission of gravitational radiation. [Equation (10.8.12) can also be derived from the “crossing symmetry” between emission and absorption; $\omega^2/\pi^2$ is a “phase-space” factor, and $\exp(-\hbar\omega/kT)$ is a Boltzmann factor representing the relative probability that an atom is in an upper level waiting to emit a graviton or a lower level waiting to absorb a graviton.] The remaining term $I(\omega)$, which is proportional to $n(\omega)$, is interpreted as the rate per unit volume and per unit frequency interval of induced emission of gravitational radiation, an effect due to the Bose statistics of the gas of gravitons. 27
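The split of the equilibrium emission rate into spontaneous and induced parts can be checked numerically: with $n(\omega)$ given by Eq. (10.8.9), the sum of Eqs. (10.8.12) and (10.8.13) should reproduce $A(\omega)n(\omega)$ for any absorption rate. The following is a minimal sketch under stated assumptions: units with $c = 1$ as in the text, the temperature entered as the single combination $kT/\hbar$, an arbitrary constant absorption rate, and hypothetical function names.

```python
import math


def planck_number_density(omega, kT_over_hbar):
    """n(omega) of Eq. (10.8.9): gravitons per unit volume per unit frequency."""
    x = omega / kT_over_hbar                      # x = hbar*omega / kT
    return omega**2 / (math.pi**2 * math.expm1(x))


def spontaneous(omega, kT_over_hbar, absorption):
    """S(omega) of Eq. (10.8.12)."""
    x = omega / kT_over_hbar
    return omega**2 / math.pi**2 * math.exp(-x) * absorption


def induced(omega, kT_over_hbar, absorption):
    """I(omega) of Eq. (10.8.13)."""
    x = omega / kT_over_hbar
    return planck_number_density(omega, kT_over_hbar) * math.exp(-x) * absorption


ABSORPTION = 3.7   # arbitrary illustrative value; the identity holds for any A(omega)
for omega in (0.01, 1.0, 25.0):
    lhs = ABSORPTION * planck_number_density(omega, 1.0)
    rhs = spontaneous(omega, 1.0, ABSORPTION) + induced(omega, 1.0, ABSORPTION)
    print(f"omega = {omega:g}: A*n = {lhs:.6e}, S + I = {rhs:.6e}")
```

The agreement is algebraic rather than accidental: $1 + [\exp(\hbar\omega/kT) - 1]^{-1} = \exp(\hbar\omega/kT)\,[\exp(\hbar\omega/kT) - 1]^{-1}$, so the Boltzmann factor in $S + I$ cancels exactly.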

The useful thing about Eqs. (10.8.12) and (10.8.13) is that they remain valid even if the gravitational radiation is not in equilibrium with matter, so that $n(\omega)$ is not given by Eq. (10.8.9). It is only necessary that the matter be in thermal equilibrium at temperature $T$. For instance, we can calculate the rate $S(\omega)$ of spontaneous emission of gravitons per unit volume and per unit frequency interval in a nonrelativistic gas by dividing Eq. (10.4.26) by $\hbar\omega$, provided that the graviton frequency $\omega$ is in the range $\omega_c \ll \omega \ll kT/\hbar$. Applying Eq. (10.8.12) then gives the absorption rate of such gravitons as

$$A(\omega) = \frac{8\pi G}{5\hbar\omega^3}\sum_{(a,b)}\mu_{ab}^2\,n_a n_b\,v_{ab}^5\int\frac{d\sigma_{ab}}{d\Omega}\,\sin^2\theta\,d\Omega$$

This $\omega^{-3}$ behavior can make $A(\omega)$ surprisingly large for low-frequency gravitons in gases at high temperature. However, the effect of induced emission is to reduce the effective absorption rate by a factor $\hbar\omega/kT$. There does not appear to be any situation in the present universe where the absorption of gravitational radiation plays any important role.
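The reduction factor follows from Eq. (10.8.13): the net loss of gravitons is $A(\omega)n(\omega) - I(\omega) = A(\omega)n(\omega)\,[1 - \exp(-\hbar\omega/kT)]$, and for $\hbar\omega \ll kT$ the bracket is approximately $\hbar\omega/kT$. A short sketch comparing the exact factor with the low-frequency estimate (the function name and the sample values of $x = \hbar\omega/kT$ are illustrative assumptions):

```python
import math


def net_absorption_factor(x):
    """Return 1 - exp(-x), with x = hbar*omega/kT.

    This is the factor by which induced emission reduces the absorption rate.
    """
    return -math.expm1(-x)   # expm1 keeps precision for very small x


for x in (1e-6, 1e-3, 1e-1, 1.0):
    exact = net_absorption_factor(x)
    print(f"x = {x:g}: exact factor = {exact:.6g}, small-x estimate = {x:g}")
```

The estimate is accurate only in the range $\omega \ll kT/\hbar$ assumed above; as $x$ approaches unity the exact factor falls noticeably below $x$.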

The preceding remarks describe what may be called a semiclassical theory of gravitation. The development of a true quantum theory of gravitation is unfortunately much more difficult. One approach is to construct an interaction Hamiltonian that can create and destroy gravitons, and then calculate transition probabilities as a power series in this interaction. Usually the Hamiltonian would be built up out of quantum fields, of the form

$$h_{\rho\nu}(x) = \sum_{\mu = \pm 2}\int d^3k\,\left\{a(\mathbf{k},\mu)\,e_{\rho\nu}(\mathbf{k},\mu)\exp(ik_\lambda x^\lambda) + a^\dagger(\mathbf{k},\mu)\,e^*_{\rho\nu}(\mathbf{k},\mu)\exp(-ik_\lambda x^\lambda)\right\} \tag{10.8.14}$$

where $e_{\rho\nu}(\mathbf{k},\mu)$ is a polarization tensor for a graviton of momentum $\hbar\mathbf{k}$ and helicity $\mu$, and $a(\mathbf{k},\mu)$ and $a^\dagger(\mathbf{k},\mu)$ are the corresponding annihilation and creation operators, characterized by the commutation relations

$$[a(\mathbf{k},\mu),\,a^\dagger(\mathbf{k}',\mu')] = \delta^3(\mathbf{k} - \mathbf{k}')\,\delta_{\mu\mu'} \tag{10.8.15}$$

$$[a(\mathbf{k},\mu),\,a(\mathbf{k}',\mu')] = [a^\dagger(\mathbf{k},\mu),\,a^\dagger(\mathbf{k}',\mu')] = 0 \tag{10.8.16}$$


The difficulty in this approach comes from the fact that the operator (10.8.14) cannot be a Lorentz tensor as long as the helicity sum is limited to the physical values $\pm 2$; as we saw in Section 10.2, a true tensor would have helicities 0 and $\pm 1$ as well as $\pm 2$. It is true that we can start with a true tensor and then subject $e_{\mu\nu}$ to a gauge transformation that will eliminate the unphysical helicities 0 and $\pm 1$, but once we choose a gauge in this way, $h_{\mu\nu}$ is no longer a tensor. To put this another way, a gauge condition, such as the statement that $e_{13}$, $e_{23}$, $e_{10}$, $e_{20}$, $e_{00}$, $e_{03}$, and $e_{33}$ vanish for $\mathbf{k}$ in the 3-direction, is not Lorentz invariant, so if we define these components to vanish, then under a Lorentz transformation $\Lambda^\mu{}_\nu$, $h_{\mu\nu}$ will not simply transform into $\Lambda_\mu{}^\rho\Lambda_\nu{}^\sigma h_{\rho\sigma}$, but will be subjected to an additional gauge transformation 28:

$$h_{\mu\nu} \to \Lambda_\mu{}^\rho\,\Lambda_\nu{}^\sigma\,h_{\rho\sigma} + \frac{\partial\epsilon_\mu}{\partial x^\nu} + \frac{\partial\epsilon_\nu}{\partial x^\mu}$$


It is no easy task to construct a Hamiltonian out of such an object in such a way as to obtain Lorentz-invariant transition probabilities.

There are two possible ways out of this difficulty. One possibility is to accept the nontensor character of $h_{\mu\nu}$, and use the noncovariant Hamiltonian formalism to derive Lorentz-invariant rules for the calculation of transition amplitudes. 29 This works fairly easily in electrodynamics, but the self-interaction of the gravitational field has so far prevented the completion of this program in general relativity. A different method, pioneered by Feynman, 30 is to start out with manifestly Lorentz-invariant calculational rules, and then tinker with them to prevent the appearance of unphysical particles with helicities 0 or $\pm 1$ in physical states. This program has been successfully carried through to completion in the work of Faddeev, 31 Mandelstam, 32 and DeWitt. 33

Unfortunately, the formulation of general rules for the calculation of transition 
probabilities in the quantum theory of gravitation has only confirmed the presence 
of another difficulty: The theory contains infinities, arising from integrals over 
large virtual momenta. Quantum electrodynamics contains similar infinities, but 
only in three or four special places, where they can be dealt with by a renormalization
of mass, charge, and wave functions. 34 In contrast, the quantum theory of
gravitation contains an infinite variety of infinities, as can be seen by an elementary
dimensional argument: The gravitational constant has dimensions $\hbar/m^2$, so a
term in a dimensionless probability amplitude of order $G^n$ will diverge like a
momentum-space integral $\int p^{2n-1}\,dp$. In this respect, the theory of gravitation is
more like other nonrenormalizable theories, such as the Fermi theory of beta 
decay, than it is like quantum electrodynamics. 

Despite these difficulties, there is one very important conclusion that can already be drawn from the quantum theory of gravitation: It is quite impossible to construct a Lorentz invariant quantum theory of particles of mass zero and helicity $\pm 2$ without building some sort of gauge invariance into the theory, 23,28 because only in this way can the interaction of the nontensor field $h_{\mu\nu}$ generate Lorentz-invariant transition amplitudes. However, we saw in Section 10.2 that the theory of gravitational radiation is gauge-invariant because general relativity is generally covariant, and, as argued in Section 4.1, general covariance is but the mathematical expression of the Principle of Equivalence. It therefore appears that the Principle of Equivalence, on which the whole of classical general relativity is based, is itself a consequence of the requirement that the quantum theory of gravitation should be Lorentz invariant.


9 Gravitational Disturbances in Gravitational Fields*


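The foregoing sections have described a Lorentz-invariant theory for the behavior of weak gravitational waves in a Minkowskian space-time. It will be useful later, in Chapter 15, on cosmology, to have available a generally covariant theory for the propagation of weak gravitational disturbances in a preexisting gravitational field $g_{\mu\nu}$.

According to Eq. (6.1.5), if $g_{\mu\nu}$ is changed by some disturbance to $g_{\mu\nu} + \delta g_{\mu\nu}$, with $\delta g_{\mu\nu}$ small, then to first order in $\delta g_{\mu\nu}$,

$$\delta R_{\mu\kappa} = \frac{\partial\,\delta\Gamma^\lambda_{\mu\lambda}}{\partial x^\kappa} - \frac{\partial\,\delta\Gamma^\lambda_{\mu\kappa}}{\partial x^\lambda} + \delta\Gamma^\eta_{\mu\lambda}\,\Gamma^\lambda_{\kappa\eta} + \Gamma^\eta_{\mu\lambda}\,\delta\Gamma^\lambda_{\kappa\eta} - \delta\Gamma^\eta_{\mu\kappa}\,\Gamma^\lambda_{\lambda\eta} - \Gamma^\eta_{\mu\kappa}\,\delta\Gamma^\lambda_{\lambda\eta}$$

where $\delta\Gamma^\lambda_{\mu\nu}$ is the change in the affine connection:

$$\delta\Gamma^\lambda_{\mu\nu} = -g^{\lambda\rho}\,\delta g_{\rho\sigma}\,\Gamma^\sigma_{\mu\nu} + \tfrac{1}{2}\,g^{\lambda\rho}\left[\frac{\partial\,\delta g_{\rho\mu}}{\partial x^\nu} + \frac{\partial\,\delta g_{\rho\nu}}{\partial x^\mu} - \frac{\partial\,\delta g_{\mu\nu}}{\partial x^\rho}\right]$$

We note that $\delta\Gamma^\lambda_{\mu\nu}$ can be expressed as a tensor:

$$\delta\Gamma^\lambda_{\mu\nu} = \tfrac{1}{2}\,g^{\lambda\rho}\left[(\delta g_{\rho\mu})_{;\nu} + (\delta g_{\rho\nu})_{;\mu} - (\delta g_{\mu\nu})_{;\rho}\right] \tag{10.9.1}$$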


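the covariant derivatives being of course constructed using the unperturbed affine connection $\Gamma^\lambda_{\mu\nu}$. Since $\delta\Gamma^\lambda_{\mu\nu}$ is a tensor, the change in the Ricci tensor can also be written in terms of covariant derivatives:

$$\delta R_{\mu\kappa} = (\delta\Gamma^\lambda_{\mu\lambda})_{;\kappa} - (\delta\Gamma^\lambda_{\mu\kappa})_{;\lambda} \tag{10.9.2}$$

This is known as the Palatini identity. In terms of $\delta g_{\mu\nu}$, it reads:

$$\delta R_{\mu\kappa} = \tfrac{1}{2}\,g^{\lambda\rho}\left[(\delta g_{\lambda\rho})_{;\mu;\kappa} - (\delta g_{\rho\mu})_{;\kappa;\lambda} - (\delta g_{\rho\kappa})_{;\mu;\lambda} + (\delta g_{\mu\kappa})_{;\rho;\lambda}\right] \tag{10.9.3}$$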

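The Einstein field equations are here presumed to be satisfied for the undisturbed gravitational field $g_{\mu\nu}$ and energy-momentum tensor $T_{\mu\nu}$. The condition that they should also be satisfied for $g_{\mu\nu} + \delta g_{\mu\nu}$ and $T_{\mu\nu} + \delta T_{\mu\nu}$ is then

$$\tfrac{1}{2}\,g^{\lambda\rho}\left[(\delta g_{\lambda\rho})_{;\mu;\kappa} - (\delta g_{\rho\mu})_{;\kappa;\lambda} - (\delta g_{\rho\kappa})_{;\mu;\lambda} + (\delta g_{\mu\kappa})_{;\rho;\lambda}\right] = -8\pi G\left[\delta T_{\mu\kappa} - \tfrac{1}{2}\,g_{\mu\kappa}\,g^{\lambda\rho}\,\delta T_{\lambda\rho} + \tfrac{1}{2}\,g_{\mu\kappa}\,\delta g_{\lambda\rho}\,T^{\lambda\rho} - \tfrac{1}{2}\,\delta g_{\mu\kappa}\,T^\lambda{}_\lambda\right] \tag{10.9.4}$$

Also, the source term $\delta T_{\mu\nu}$ obeys the conservation law:

$$0 = (\delta T^{\nu\mu})_{;\mu} + T^{\nu\lambda}\,\delta\Gamma^\mu_{\mu\lambda} + T^{\lambda\mu}\,\delta\Gamma^\nu_{\mu\lambda} \tag{10.9.5}$$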

The general covariance of these equations is manifest. 

* This section lies somewhat out of the book’s main line of development, and may be omitted in a first reading.

Just as for gravitational waves in a Minkowskian space-time, it is important here to distinguish physical disturbances from mere changes in the coordinate system. To this end, let us consider a general infinitesimal coordinate transformation

$$x^\mu \to x'^\mu = x^\mu - \epsilon^\mu(x) \tag{10.9.6}$$

where $\epsilon^\mu(x)$ is an arbitrary infinitesimal vector field. The partial derivatives occurring in the tensor transformation rules are here

$$\frac{\partial x'^\mu}{\partial x^\nu} = \delta^\mu_\nu - \frac{\partial\epsilon^\mu(x)}{\partial x^\nu}$$

$$\frac{\partial x^\mu}{\partial x'^\nu} = \delta^\mu_\nu + \frac{\partial\epsilon^\mu(x)}{\partial x^\nu} + O(\epsilon^2)$$

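Since Einstein’s equations are generally covariant, and $g_{\mu\nu}(x)$ is a solution for an energy-momentum tensor $T_{\mu\nu}(x)$, it follows that $g'_{\mu\nu}(x)$ is a solution for $T'_{\mu\nu}(x)$, where

$$g'_{\mu\nu}(x) = g'_{\mu\nu}(x') + \frac{\partial g_{\mu\nu}(x)}{\partial x^\lambda}\,\epsilon^\lambda(x) + O(\epsilon^2)$$

$$\qquad\quad = g_{\mu\nu}(x) + g_{\mu\lambda}(x)\,\frac{\partial\epsilon^\lambda(x)}{\partial x^\nu} + g_{\lambda\nu}(x)\,\frac{\partial\epsilon^\lambda(x)}{\partial x^\mu} + \frac{\partial g_{\mu\nu}(x)}{\partial x^\lambda}\,\epsilon^\lambda(x) + O(\epsilon^2)$$

and likewise for $T'_{\mu\nu}(x)$. In covariant terms, we conclude that

$$g'_{\mu\nu}(x) = g_{\mu\nu}(x) + \Delta_\epsilon g_{\mu\nu}(x) \tag{10.9.7}$$

is a solution of Einstein’s equations for an energy-momentum tensor

$$T'_{\mu\nu}(x) = T_{\mu\nu}(x) + \Delta_\epsilon T_{\mu\nu}(x) \tag{10.9.8}$$

where

$$\Delta_\epsilon g_{\mu\nu} \equiv \epsilon_{\mu;\nu} + \epsilon_{\nu;\mu} \tag{10.9.9}$$

$$\Delta_\epsilon T_{\mu\nu} \equiv T_{\mu\lambda}\,\epsilon^\lambda{}_{;\nu} + T_{\lambda\nu}\,\epsilon^\lambda{}_{;\mu} + T_{\mu\nu;\lambda}\,\epsilon^\lambda \tag{10.9.10}$$

(Note that $\Delta_\epsilon g_{\mu\nu}$ has the same form as $\Delta_\epsilon T_{\mu\nu}$, except that $g_{\mu\nu}$ has vanishing covariant derivatives, whereas $T_{\mu\nu}$ does not.) It follows, and it is straightforward to verify directly, that $\delta g_{\mu\nu} = \Delta_\epsilon g_{\mu\nu}$ is a solution of the field equation (10.9.4) for a source perturbation $\delta T_{\mu\nu} = \Delta_\epsilon T_{\mu\nu}$. But Eq. (10.9.4) is a linear differential equation, and so, given any solution $\delta g_{\mu\nu}$, we can always find other solutions of the form $\delta g_{\mu\nu} + \Delta_\epsilon g_{\mu\nu}$ with precisely the same physical content. The freedom to add terms $\Delta_\epsilon g_{\mu\nu}$ for arbitrary functions $\epsilon^\mu(x)$ corresponds to the “gauge invariance” discussed in Section 10.1.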

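The operator $\Delta_\epsilon$ introduced in Eqs. (10.9.9) and (10.9.10) can be generalized to arbitrary tensors by specifying that a term involving the contraction of the tensor with the covariant derivative of $\epsilon$ should be included with a $+$ sign for each covariant index and a $-$ sign for each contravariant index. That is, for scalars we define

$$\Delta_\epsilon S \equiv S_{;\lambda}\,\epsilon^\lambda$$

for vectors we define

$$\Delta_\epsilon V_\mu \equiv V_\lambda\,\epsilon^\lambda{}_{;\mu} + V_{\mu;\lambda}\,\epsilon^\lambda$$

$$\Delta_\epsilon U^\mu \equiv -U^\lambda\,\epsilon^\mu{}_{;\lambda} + U^\mu{}_{;\lambda}\,\epsilon^\lambda$$

for contravariant and mixed tensors of second rank we define

$$\Delta_\epsilon T^{\mu\nu} \equiv -T^{\lambda\nu}\,\epsilon^\mu{}_{;\lambda} - T^{\mu\lambda}\,\epsilon^\nu{}_{;\lambda} + T^{\mu\nu}{}_{;\lambda}\,\epsilon^\lambda$$

$$\Delta_\epsilon T^\mu{}_\nu \equiv -T^\lambda{}_\nu\,\epsilon^\mu{}_{;\lambda} + T^\mu{}_\lambda\,\epsilon^\lambda{}_{;\nu} + T^\mu{}_{\nu;\lambda}\,\epsilon^\lambda$$

and so on. The operator $\Delta_\epsilon$ defined in this way is known as the Lie derivative. In general, the effect of an infinitesimal coordinate transformation on any tensor $T$ is that the new tensor equals the old tensor at the same coordinate point, plus the Lie derivative $\Delta_\epsilon T$. It is easy to show that the operator $\Delta_\epsilon$ has the same abstract properties as ordinary derivatives or covariant derivatives: It is linear,

$$\Delta_\epsilon\left[aA^\mu{}_\nu + bB^\mu{}_\nu\right] = a\,\Delta_\epsilon A^\mu{}_\nu + b\,\Delta_\epsilon B^\mu{}_\nu \qquad \text{(for } a, b \text{ constant scalars)}$$

it obeys the Leibniz rule,

$$\Delta_\epsilon\left(A^\mu{}_\nu B^\kappa\right) = B^\kappa\,\Delta_\epsilon A^\mu{}_\nu + A^\mu{}_\nu\,\Delta_\epsilon B^\kappa$$

and it commutes with the operation of contraction,

$$\Delta_\epsilon T^{\mu\lambda}{}_\lambda = -T^{\nu\lambda}{}_\lambda\,\epsilon^\mu{}_{;\nu} + T^{\mu\lambda}{}_{\lambda;\nu}\,\epsilon^\nu$$

In particular, the Lie derivative of the energy-momentum tensor for a perfect fluid is

$$\Delta_\epsilon T_{\mu\nu} = p\,\Delta_\epsilon g_{\mu\nu} + g_{\mu\nu}\,\Delta_\epsilon p + (p + \rho)\left[U_\mu\,\Delta_\epsilon U_\nu + U_\nu\,\Delta_\epsilon U_\mu\right] + U_\mu U_\nu\left[\Delta_\epsilon p + \Delta_\epsilon\rho\right]$$

so $\Delta_\epsilon g_{\mu\nu}$ is a solution of Einstein’s equations for a fluid whose velocity, pressure, and density are perturbed by $\Delta_\epsilon U_\mu$, $\Delta_\epsilon p$, and $\Delta_\epsilon\rho$, respectively.

The solution of the field equations (10.9.4) is quite complicated, except for the simple case of a homogeneous and isotropic unperturbed metric $g_{\mu\nu}$. This case will be considered in Section 15.10.


“Things fall apart; the centre cannot hold,
Mere anarchy is loosed upon the world.”
W. B. Yeats, The Second Coming


11 STELLAR EQUILIBRIUM AND COLLAPSE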


Gravitational fields are so weak that the practicing astrophysicist can usually 
ignore general relativity. This chapter deals with various sorts of objects in which 
relativistic effects play an important, or in some cases a dominant, role. One of 
these is the neutron star, a “cold” star composed primarily of neutrons and 
supported against collapse by neutron degeneracy pressure. Another is the
supermassive star, a giant object supported by radiation pressure, in which
general-relativistic effects can tip the balance between stability and instability. Most
impressive of all is the black hole, a body caught in an inexorable gravitational 
collapse. 

The existence of neutron stars and black holes was suggested in the 1930’s on 
purely theoretical grounds, chiefly through the work of J. Robert Oppenheimer 
and his collaborators. However, these exotic objects remained a textbook 
curiosity until the 1960’s, when the cooperative efforts of radio and optical 
astronomers began to reveal a great many strange new things in the sky. 

First came the quasi-stellar objects (QSO’s), objects with starlike optical images, often containing powerful compact radio sources, and with red shifts $\Delta\lambda/\lambda$ ranging from 0.131 to nearly 3. (See Figure 11.1.) One can suppose three different sorts of explanations for these red shifts: They can arise from a Doppler effect, caused either by a local explosion or by the general cosmological recession of very distant objects (see Chapter 14), or they can arise from powerful gravitational fields within the objects themselves. In any case, it is likely that general-relativistic effects will play an important role in the explanation of the QSO’s. If these objects are relatively near but moving at relativistic velocities, then some source of energy must be found that could convert mass into kinetic energy, with nearly 100% efficiency. If the QSO’s are at cosmological distances, then their apparent optical luminosity indicates an absolute luminosity much greater than that of the largest galaxies, so again a powerful new source of energy is required. Only gravitational attraction seems to offer an adequate energy source, and for this reason the discovery of the QSO’s reawakened general interest in the phenomenon of gravitational collapse. Finally, if the QSO red shifts are gravitational, then these objects must be so highly compressed that their structure would have to be understood in terms of general relativity rather than Newtonian mechanics.

Figure 11.1 Four quasi-stellar objects. These photographs were taken with the 200-in. telescope at Mt. Palomar, following position determinations by radio astronomers. (Courtesy Mt. Wilson and Palomar Observatories.)

The quasi-stellar objects are only the most spectacular end of a continuum of ill-understood objects discovered in recent years, including Seyfert galaxies, giant elliptical galaxies with powerful compact radio sources, X-ray sources, galactic nuclei that seem in some cases to be exploding, and so on. It is not clear what, if anything, general relativity has to do with these objects.

In the last few years a new species of astronomical exotica was discovered:
the pulsars, radio sources that pulse at regular frequencies ranging from a few
tenths of a Hz to 30 Hz. The pulsars are often associated with optical and even X-ray
sources that pulse at the same rate. There appears now to be a general consensus 
that pulsars are the neutron stars discovered theoretically in the 1930’s, but with 
a rapid rate of rotation that somehow or other produces the observed pulses. 

A realistic discussion of quasi-stellar objects, galactic nuclei, pulsars, and so on, would require that we consider the effects of radiative energy transport, neutrino energy transport, turbulence, nuclear forces, magnetic fields and, above all, rotation. It would also require the discussion of massive calculations using automatic computers. In preparing this chapter, I have tried to restrict myself to the simplest calculations, which can be carried out analytically without too much trouble. These simple calculations are not very useful for a detailed understanding of astronomical observations, but they provide a valuable insight into the possible roles that general relativity can play in astrophysical phenomena.


1 Differential Equations for Stellar Structure 

We first set up the general-relativistic machinery for computing the pressure, 
density, and gravitational fields within a spherically symmetric static star. 

The metric will be taken in the “standard” form discussed in Section 8.1:

$$g_{rr} = A(r), \quad g_{\theta\theta} = r^2, \quad g_{\varphi\varphi} = r^2\sin^2\theta, \quad g_{tt} = -B(r), \quad g_{\mu\nu} = 0 \ \text{ for } \mu \neq \nu \tag{11.1.1}$$

The energy-momentum tensor is assumed to be that for a perfect fluid (see Section 5.4):

$$T_{\mu\nu} = p\,g_{\mu\nu} + (p + \rho)\,U_\mu U_\nu \tag{11.1.2}$$

with $p$ the proper pressure, $\rho$ the proper total energy density, and $U^\mu$ the velocity four-vector, defined so that

$$g^{\mu\nu}\,U_\mu U_\nu = -1 \tag{11.1.3}$$

Since the fluid is at rest, we take

$$U_r = U_\theta = U_\varphi = 0; \qquad U_t = -(-g^{tt})^{-1/2} = -\sqrt{B(r)} \tag{11.1.4}$$

Our assumptions of time independence and spherical symmetry imply that $p$ and $\rho$ are functions only of the radial coordinate $r$.

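By making use of Eqs. (11.1.1)–(11.1.4) and the Ricci tensor components given by Eq. (8.1.13), we find that the Einstein equations (7.1.15) read

$$R_{rr} = \frac{B''}{2B} - \frac{B'}{4B}\left(\frac{A'}{A} + \frac{B'}{B}\right) - \frac{A'}{rA} = -4\pi G(\rho - p)A \tag{11.1.5}$$

$$R_{\theta\theta} = -1 + \frac{r}{2A}\left(-\frac{A'}{A} + \frac{B'}{B}\right) + \frac{1}{A} = -4\pi G(\rho - p)r^2 \tag{11.1.6}$$

$$R_{tt} = -\frac{B''}{2A} + \frac{B'}{4A}\left(\frac{A'}{A} + \frac{B'}{B}\right) - \frac{B'}{rA} = -4\pi G(\rho + 3p)B \tag{11.1.7}$$

A prime denotes $d/dr$. (We do not need to write down the equation for $R_{\varphi\varphi}$, which is identical to that for $R_{\theta\theta}$, or the equations for off-diagonal elements of $R_{\mu\nu}$, which simply say that zero equals zero.) In addition, we may recall the equation (5.4.5) for hydrostatic equilibrium,

$$\frac{B'}{B} = -\frac{2p'}{p + \rho} \tag{11.1.8}$$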


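Our first step in solving these equations is to derive an equation for $A(r)$ alone, by forming the quantity

$$\frac{R_{rr}}{2A} + \frac{R_{\theta\theta}}{r^2} + \frac{R_{tt}}{2B} = -\frac{A'}{rA^2} - \frac{1}{r^2} + \frac{1}{Ar^2} = -8\pi G\rho \tag{11.1.9}$$

This equation can be written

$$\left(\frac{r}{A}\right)' = 1 - 8\pi G\rho r^2 \tag{11.1.10}$$

The solution with $A(0)$ finite is

$$A(r) = \left[1 - \frac{2G\mathcal{M}(r)}{r}\right]^{-1} \tag{11.1.11}$$

where

$$\mathcal{M}(r) \equiv \int_0^r 4\pi r'^2\,\rho(r')\,dr' \tag{11.1.12}$$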


We can now use (11.1.11) and (11.1.8) to eliminate the gravitational fields $A(r)$, $B(r)$ from Eq. (11.1.6), which becomes

$$-1 + \left[1 - \frac{2G\mathcal{M}}{r}\right]\left[1 - \frac{rp'}{p + \rho}\right] + \frac{G\mathcal{M}}{r} - 4\pi G\rho r^2 = -4\pi G(\rho - p)r^2$$

We rewrite this as

$$-r^2 p'(r) = G\mathcal{M}(r)\rho(r)\left[1 + \frac{p(r)}{\rho(r)}\right]\left[1 + \frac{4\pi r^3 p(r)}{\mathcal{M}(r)}\right]\left[1 - \frac{2G\mathcal{M}(r)}{r}\right]^{-1} \tag{11.1.13}$$

The reader may recognize this differential equation as the fundamental equation of Newtonian astrophysics (see Section 11.3), with general-relativistic corrections supplied by the last three factors.

We are primarily concerned in this chapter with stars that are isentropic, 
that is, in which the entropy per nucleon s does not vary throughout the star. This 
is the case for two very different kinds of star : 

(A) Stars at Absolute Zero. When a star exhausts its thermonuclear fuel it can 
become a white dwarf (Section 11.3), or a neutron star (Section 11.4), in which the 
temperature is essentially at absolute zero. According to Nernst’s theorem, the 
entropy per nucleon will then be zero throughout the star. 

(B) Stars in Convective Equilibrium. If the most efficient mechanism for energy
transfer within the star is convection, then in equilibrium the entropy per nucleon
must be nearly constant throughout the star, because otherwise a small element
of fluid containing $A$ nucleons could gain or lose an energy $A\,\Delta s\,T$ when transported
from one part of the star to another, and convection would therefore disturb the
energy distribution. The supermassive “stars” discussed in Section 11.5 are
generally presumed to be in convective equilibrium.

We also assume that the stars we consider have a chemical composition that is 
constant throughout. 

The importance of the preceding assumptions lies in the fact that the pressure $p$ may in general be expressed as a function of the density $\rho$, the entropy per nucleon $s$, and the chemical composition. Hence, with $s$ and the chemical composition constant throughout the star, $p(r)$ may be regarded as a function of $\rho(r)$ alone, with no explicit dependence on $r$.

Given $p(r)$ as a function $p(\rho(r))$, we now formulate our problem as a pair of first-order differential equations for $\rho(r)$ and $\mathcal{M}(r)$. One of these is Eq. (11.1.13); the other is the derivative of Eq. (11.1.12):

$$\mathcal{M}'(r) = 4\pi r^2\rho(r) \tag{11.1.14}$$

In addition, Eq. (11.1.12) provides an initial condition:

$$\mathcal{M}(0) = 0 \tag{11.1.15}$$

Equations (11.1.13)–(11.1.15), together with an equation of state giving $p(\rho)$, serve to determine $\rho(r)$, $\mathcal{M}(r)$, $p(r)$, and so on, throughout the star, once we specify the other initial condition, that is, the value of $\rho(0)$. The differential equations (11.1.13) and (11.1.14) must be integrated out from the center of the star, until $p(\rho(r))$ drops to zero at some point $r = R$, which we then interpret as the radius of the particular star with central density $\rho(0)$.
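The outward integration just described can be sketched in a few lines. This is only an illustration under stated assumptions, not a production stellar-structure code: it uses units with $G = c = 1$, a fixed-step fourth-order Runge-Kutta scheme (a choice made here, not prescribed by the text), the pressure rather than the density as the integration variable, and an equation of state supplied as a hypothetical callable `rho_of_p`. The sample run uses a uniform-density star purely because it is the simplest case to check.

```python
import math


def tov_derivatives(r, p, m, rho_of_p):
    """Right-hand sides of Eqs. (11.1.13) and (11.1.14) in units with G = c = 1."""
    if r == 0.0:
        return 0.0, 0.0                  # both derivatives vanish at the center
    p = max(p, 0.0)                      # guard against slightly negative trial pressures
    rho = rho_of_p(p)
    dp_dr = -(rho + p) * (m + 4.0 * math.pi * r**3 * p) / (r * r * (1.0 - 2.0 * m / r))
    dm_dr = 4.0 * math.pi * r * r * rho
    return dp_dr, dm_dr


def integrate_star(p_center, rho_of_p, dr=1.0e-3, r_max=1.0e3):
    """Integrate outward from r = 0 until the pressure reaches zero.

    Returns (R, M): the radius at which p = 0 and the mass enclosed there.
    """
    r, p, m = 0.0, p_center, 0.0
    while r < r_max:
        k1p, k1m = tov_derivatives(r, p, m, rho_of_p)
        k2p, k2m = tov_derivatives(r + dr / 2, p + dr * k1p / 2, m + dr * k1m / 2, rho_of_p)
        k3p, k3m = tov_derivatives(r + dr / 2, p + dr * k2p / 2, m + dr * k2m / 2, rho_of_p)
        k4p, k4m = tov_derivatives(r + dr, p + dr * k3p, m + dr * k3m, rho_of_p)
        p_next = p + dr * (k1p + 2 * k2p + 2 * k3p + k4p) / 6
        m_next = m + dr * (k1m + 2 * k2m + 2 * k3m + k4m) / 6
        if p_next <= 0.0:
            frac = p / (p - p_next)      # linear interpolation to the p = 0 crossing
            return r + frac * dr, m + frac * (m_next - m)
        r, p, m = r + dr, p_next, m_next
    raise RuntimeError("pressure did not reach zero before r_max")


# Illustrative run: uniform density RHO0 and an arbitrary central pressure.
RHO0 = 1.0e-3
radius, mass = integrate_star(p_center=1.0e-4, rho_of_p=lambda p: RHO0)
print(f"R = {radius:.4f}, M = {mass:.4f}, 2GM/R = {2.0 * mass / radius:.4f}")
```

For an isentropic star one would replace the constant-density lambda by the inverse of the equation of state $p(\rho)$; scanning over the central pressure then traces out the one-parameter family of equilibrium configurations.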

Let us return to the problem of calculating the metric. Once we compute $\rho(r)$, $\mathcal{M}(r)$, and $p(r)$, we can immediately obtain $A(r)$ from Eq. (11.1.11); to find $B(r)$ we use Eq. (11.1.13) to rewrite (11.1.8) as

$$\frac{B'}{B} = \frac{2G}{r^2}\left[\mathcal{M} + 4\pi r^3 p\right]\left[1 - \frac{2G\mathcal{M}}{r}\right]^{-1}$$

The solution with $B(\infty) = 1$ is

$$B(r) = \exp\left\{-\int_r^\infty \frac{2G}{r'^2}\left[\mathcal{M}(r') + 4\pi r'^3 p(r')\right]\left[1 - \frac{2G\mathcal{M}(r')}{r'}\right]^{-1} dr'\right\} \tag{11.1.16}$$

Our solution is now complete. (Incidentally, we did not need to use Eqs. (11.1.5) and (11.1.7) for $R_{rr}$ and $R_{tt}$ separately, because these equations follow from (11.1.6), (11.1.8), and (11.1.9), which were used in our calculation. This should not be surprising, because Eq. (11.1.8), which is really just the equation for momentum conservation, follows from the Einstein equations (11.1.5)–(11.1.7) via the Bianchi identities.)

Outside the star, $p(r)$ and $\rho(r)$ vanish, and $\mathcal{M}(r)$ is the constant $\mathcal{M}(R)$, so Eqs. (11.1.11) and (11.1.16) give

$$B(r) = A^{-1}(r) = 1 - \frac{2G\mathcal{M}(R)}{r} \qquad \text{for } r > R \tag{11.1.17}$$

The discussion of Section 8.2 shows that the constant $\mathcal{M}(R)$ that appears in the asymptotic gravitational field (11.1.17) must equal the mass $M$ of the star, defined as the total energy of the star and its gravitational field, that is,

$$M \equiv \mathcal{M}(R) = \int_0^R 4\pi r^2\rho(r)\,dr \tag{11.1.18}$$

Thus (11.1.17) is just the familiar exterior Schwarzschild solution.

It may appear paradoxical that $M$, which must include the energy of the gravitational field, is given in (11.1.18) as the integral of the energy density $\rho(r)$ of matter (including radiation) alone. The resolution is that (11.1.18) does not say that $M$ is the total energy of the matter. The total material energy is not really well defined, but it might be computed by splitting up the star into small volume elements and adding up the energies of each element as measured in a locally inertial reference frame; this would give the material energy as

$$\int \sqrt{g}\,\rho\,dr\,d\theta\,d\varphi = \int_0^R 4\pi r^2\sqrt{A(r)B(r)}\,\rho(r)\,dr \tag{11.1.19}$$

The difference between (11.1.18) and (11.1.19) can be regarded as the energy of the gravitational field. However, this decomposition is not particularly useful, and will not be employed here.

It is more informative to compare (11.1.18) with the energy $M_0$ that the matter of the star would have if dispersed to infinity. This is simply

$$M_0 \equiv m_N N \tag{11.1.20}$$

where $m_N = 1.66 \times 10^{-24}$ g is the rest-mass of a nucleon and $N$ is the number of nucleons in the star. The nucleon number is given by

$$N = \int \sqrt{g}\,J_N^0\,dr\,d\theta\,d\varphi = \int_0^R 4\pi r^2\sqrt{A(r)B(r)}\,J_N^0(r)\,dr \tag{11.1.21}$$

where $J_N^\mu$ is the conserved nucleon number current. It is convenient to express $J_N^0$ in terms of the proper nucleon number density $n$, that is, the nucleon number density measured in a locally inertial reference frame at rest in the star, which is

$$n \equiv -U_\mu J_N^\mu = \sqrt{B}\,J_N^0 \tag{11.1.22}$$

(See Eq. (11.1.4), and recall that in a locally inertial coordinate frame $U_0 = -1$.) Equation (11.1.21) then becomes

$$N = \int_0^R 4\pi r^2\sqrt{A(r)}\,n(r)\,dr = \int_0^R 4\pi r^2\left[1 - \frac{2G\mathcal{M}(r)}{r}\right]^{-1/2} n(r)\,dr \tag{11.1.23}$$


The proper number density $n(r)$ is in general a function of the proper density $\rho(r)$, the chemical composition, and the entropy per nucleon $s$, so $n(r)$ and $N$ are fixed for a star with a given constant $s$ and chemical composition, once we choose $\rho(0)$. The internal energy of the star is now given by

$$E \equiv M - m_N N \tag{11.1.24}$$

We can also define a proper internal material energy density as

$$e(r) \equiv \rho(r) - m_N n(r) \tag{11.1.25}$$

and write (11.1.24) as

$$E = T + V \tag{11.1.26}$$

where $T$ and $V$ are the thermal and gravitational energies, respectively, of the star:

$$T \equiv \int_0^R 4\pi r^2\left[1 - \frac{2G\mathcal{M}(r)}{r}\right]^{-1/2} e(r)\,dr \tag{11.1.27}$$

$$V \equiv \int_0^R 4\pi r^2\left\{1 - \left[1 - \frac{2G\mathcal{M}(r)}{r}\right]^{-1/2}\right\}\rho(r)\,dr \tag{11.1.28}$$

Expanding the square roots gives

$$T = \int_0^R 4\pi r^2\left\{1 + \frac{G\mathcal{M}(r)}{r} + \cdots\right\} e(r)\,dr \tag{11.1.29}$$

$$V = -\int_0^R 4\pi r^2\left\{\frac{G\mathcal{M}(r)}{r} + \frac{3G^2\mathcal{M}^2(r)}{2r^2} + \cdots\right\}\rho(r)\,dr \tag{11.1.30}$$

The first terms in $T$ and $V$ are recognizable as the Newtonian values for the thermal and gravitational energies of the star; in particular, note that the first term in $V$ may be written

$$-G\int_0^R 4\pi r\,\mathcal{M}(r)\rho(r)\,dr = -\frac{G}{2}\int_0^R \frac{1}{r}\,d(\mathcal{M}^2(r)) = -\frac{GM^2}{2R} - \frac{1}{2}\int_0^R \mathcal{M}(r)\,d\phi(r) = \frac{1}{2}\int_0^R \phi(r)\,d\mathcal{M}(r) \tag{11.1.31}$$

where $\phi$ is the Newtonian potential, given inside the star by

$$\phi(r) = -\frac{GM}{R} - G\int_r^R \frac{\mathcal{M}(r')\,dr'}{r'^2}$$

The higher terms in T and V are discussed in Section 11.5. 

To repeat our main conclusion: Once we specify that a star has a definite
uniform entropy per nucleon and chemical composition, all properties of the star,
including $p(r)$, $\rho(r)$, $n(r)$, $e(r)$, $M$, $N$, and $E$, are determined as functions of the
central density $\rho(0)$. This is not the case for ordinary stars like the sun, in which
the entropy distribution is not uniform and has to be determined from the
equations of radiative equilibrium. However, the considerations of this section do
provide an adequate basis for the study of the exotic structures discussed in this
chapter.


2 Stability 

Our work is not done when we obtain a solution of the fundamental equations 
(11.1.13), (11.1.14). Such a solution represents an equilibrium state of the star, 
but it may be a state of stable or of unstable equilibrium. For most purposes it is 
only the stable solutions that interest the astrophysicist. 

In order to tell whether a particular configuration is unstable it would in
general be necessary to compute the frequencies $\omega_n$ of all normal modes of the
configuration and check whether any $\omega_n$ has a positive imaginary part; in this
case the factor $\exp(-i\omega_n t)$, which gives the time variation of this mode, would
grow exponentially and the system would be unstable. However, it is often
possible to tell from the equilibrium solution alone whether the corresponding
configuration is stable, by making use of the following theorem 1 :

Theorem 1. A star, consisting of a perfect fluid with constant chemical
composition and entropy per nucleon, can only pass from stability to instability
with respect to some particular radial normal mode, at a value of the central
density $\rho(0)$ for which the equilibrium energy $E$ and nucleon number $N$ are
stationary, that is,

$$\frac{\partial E(\rho(0))}{\partial \rho(0)} = 0, \qquad \frac{\partial N(\rho(0))}{\partial \rho(0)} = 0$$

By a “radial” normal mode is meant a mode of oscillation in which the density
perturbation $\delta\rho$ is a function of $r$ and $t$ alone, and in which nuclear reactions,
viscosity, heat conduction, and radiative energy transfer play no role.

The first step in the proof is to note that dissipative forces are absent here,
so the dynamical equations are time-reversal-invariant, and give the squared
frequencies $\omega_n^2$ of the various normal modes as real continuous functions of $\rho(0)$,
just as in an electrical circuit without resistors. For each $\omega_n^2 > 0$ there are two
modes that undergo stable oscillation. For each $\omega_n^2 < 0$ there are two modes,
one of which is exponentially damped and one of which grows exponentially, as
$\exp(-|\omega_n| t)$ and $\exp(+|\omega_n| t)$, respectively. Thus the transition from stability to
instability can only occur at a value of $\rho(0)$ for which $\omega_n^2$ vanishes.

Consider some value of $\rho(0)$ for which a particular frequency $\omega_n$ is nearly
zero. Then it takes a long time for the oscillation or growth of this mode to change
the equilibrium configuration into some neighboring configuration $\rho(r) + \delta\rho(r)$.
Since this is going on so slowly, $\rho(r) + \delta\rho(r)$ must also be essentially an equilibrium
configuration. In the absence of nuclear reactions, the new configuration will have
the same uniform chemical composition as the old one. In the absence of viscosity,
heat conduction, or radiative energy transfer, the new configuration will also have
the same entropy per nucleon as the old one. Moreover, the conservation of energy
and of nucleon number tells us that the new configuration will have the same
energy $E$ and baryon number $N$ as the old one. However, $\delta\rho(0)$ cannot vanish,
because an equilibrium configuration is entirely specified (for a given uniform $s$
and chemical composition) by the value of $\rho(0)$; if $\delta\rho(0)$ were zero, then $\delta\rho(r)$
would be zero for all $r$, and the normal mode would be absent. Thus at a point of
transition from stability to instability there are neighboring equilibrium
configurations with different values of $\rho(0)$, but with the same uniform entropy per
nucleon and chemical composition, and with the same $E$ and $N$, as was to be
proven.

This theorem is particularly valuable because we can often use qualitative
arguments to show that an equilibrium configuration is stable for $\rho(0)$ sufficiently
small (or large) and unstable for $\rho(0)$ sufficiently large (or small); the theorem
tells us precisely where the transition from stability to instability occurs. As a
guide in such qualitative considerations, it is helpful to reformulate the
fundamental equations of stellar structure in a variational principle 2 :

Theorem 2. A particular stellar configuration, with uniform entropy per
nucleon and chemical composition, will satisfy the equations (11.1.12), (11.1.13)
for equilibrium, if and only if the quantity $M$, defined by

$$M \equiv \int_0^\infty 4\pi r^2 \rho(r)\, dr$$

is stationary with respect to all variations of $\rho(r)$ that leave unchanged the
quantity

$$N \equiv \int_0^\infty 4\pi r^2 n(r) \left[1 - \frac{2G\mathcal{M}(r)}{r}\right]^{-1/2} dr$$

and that leave the entropy per nucleon and the chemical composition uniform
and unchanged. [It is understood here that with the entropy per nucleon and
chemical composition fixed, the equation of state gives both $p(r)$ and $n(r)$ as
functions of $\rho(r)$.] The equilibrium is stable with respect to radial oscillations if and
only if $M$, or equivalently $E$, is a minimum with respect to all such variations.

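To prove this theorem we use the Lagrange multiplier method 3 : $M$ will be
stationary with respect to all variations that leave $N$ fixed if and only if there
exists a constant $\lambda$ for which $M - \lambda N$ is stationary with respect to all variations.
In general, the change in $M - \lambda N$ for a given variation $\delta\rho(r)$ is

$$\delta M - \lambda\,\delta N = \int_0^\infty 4\pi r^2\, \delta\rho(r)\, dr - \lambda \int_0^\infty 4\pi r^2 \left[1 - \frac{2G\mathcal{M}(r)}{r}\right]^{-1/2} \delta n(r)\, dr - \lambda G \int_0^\infty 4\pi r \left[1 - \frac{2G\mathcal{M}(r)}{r}\right]^{-3/2} n(r)\, \delta\mathcal{M}(r)\, dr$$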

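(The integrals are carried to infinity for notational convenience; actually the
integrands vanish outside a radius $R + \delta R$.) These variations are supposed not to
change the entropy per nucleon, so

$$0 = \delta\!\left(\frac{\rho(r)}{n(r)}\right) + p(r)\, \delta\!\left(\frac{1}{n(r)}\right)$$

and therefore

$$\delta n(r) = \frac{n(r)}{p(r) + \rho(r)}\, \delta\rho(r)$$

Also, the variation in $\mathcal{M}(r)$ is

$$\delta\mathcal{M}(r) = \int_0^r 4\pi r'^2\, \delta\rho(r')\, dr'$$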


Interchanging the $r$ and $r'$ integrals in the last term, we now have

$$\delta M - \lambda\,\delta N = \int_0^\infty 4\pi r^2 \left\{1 - \frac{\lambda n(r)}{p(r) + \rho(r)} \left[1 - \frac{2G\mathcal{M}(r)}{r}\right]^{-1/2} - \lambda G \int_r^\infty 4\pi r' n(r') \left[1 - \frac{2G\mathcal{M}(r')}{r'}\right]^{-3/2} dr' \right\} \delta\rho(r)\, dr$$

Thus $\delta M - \lambda\,\delta N$ will vanish for all $\delta\rho(r)$ if and only if

$$\frac{1}{\lambda} = \frac{n(r)}{p(r) + \rho(r)} \left[1 - \frac{2G\mathcal{M}(r)}{r}\right]^{-1/2} + G \int_r^\infty 4\pi r' n(r') \left[1 - \frac{2G\mathcal{M}(r')}{r'}\right]^{-3/2} dr'$$

This will be the case for some Lagrange multiplier $\lambda$ if and only if the right-hand
side is independent of $r$, that is, if and only if

$$0 = \left\{\frac{n'}{p + \rho} - \frac{n(p' + \rho')}{(p + \rho)^2}\right\} \left[1 - \frac{2G\mathcal{M}}{r}\right]^{-1/2} + \frac{Gn}{p + \rho} \left\{4\pi r \rho - \frac{\mathcal{M}}{r^2}\right\} \left[1 - \frac{2G\mathcal{M}}{r}\right]^{-3/2} - 4\pi G r n \left[1 - \frac{2G\mathcal{M}}{r}\right]^{-3/2}$$

The condition of uniform entropy per nucleon gives

$$0 = \frac{d}{dr}\left(\frac{\rho}{n}\right) + p\, \frac{d}{dr}\left(\frac{1}{n}\right)$$

and therefore

$$n'(r) = \frac{n(r)\rho'(r)}{p(r) + \rho(r)}$$

Therefore $\delta M$ vanishes for all $\delta\rho(r)$ that give $\delta N = 0$, if and only if

$$-r^2 p' = G \left[1 - \frac{2G\mathcal{M}}{r}\right]^{-1} [p + \rho]\,[\mathcal{M} + 4\pi r^3 p]$$

as was to be proved. If the term in $\delta M$ of second order in $\delta\rho(r)$ is positive-definite
for all perturbations, then energy must be supplied in order to produce any
perturbation, and the star is stable. On the other hand, if $\delta M$ can in second order
be negative for some perturbation $\delta\rho(r)$, then this perturbation can grow with an
increase in kinetic energy, and the star is unstable.


3 Newtonian Stars: Polytropes and White Dwarfs

Most of the stars in the sky are adequately described by Newtonian physics,
without taking account of general relativity. Such Newtonian stars deserve some
attention here, both because they serve as limiting cases for the more exotic
objects that interest the general relativist, and because they can guide us in
understanding the qualitative properties of these objects.

In Newtonian astrophysics the internal energy and pressure are very much
less than the rest-mass density,

$$e \ll m_N n, \qquad p \ll m_N n \tag{11.3.1}$$

so that the total density is dominated by the density of rest-mass,

$$\rho \simeq m_N n$$

and also

$$p \ll \rho, \qquad 4\pi r^3 p \ll \mathcal{M} \tag{11.3.2}$$

In addition, the gravitational potential is everywhere small, so

$$\frac{2G\mathcal{M}}{r} \ll 1 \tag{11.3.3}$$

The fundamental equation (11.1.13) thus simplifies to

$$-r^2 p'(r) = G\mathcal{M}(r)\rho(r) \tag{11.3.4}$$

with $\mathcal{M}(r)$ still defined by

$$\mathcal{M}(r) \equiv \int_0^r 4\pi r'^2 \rho(r')\, dr' \tag{11.3.5}$$

Dividing (11.3.4) by $\rho(r)$ and differentiating allows us to combine both (11.3.4)
and (11.3.5) in a single second-order differential equation:

$$\frac{d}{dr}\, \frac{r^2}{\rho(r)}\, \frac{dp(r)}{dr} = -4\pi G r^2 \rho(r) \tag{11.3.6}$$

In order that $\rho(0)$ be finite, it is necessary that $p'(0)$ vanish. Thus, given an equation
of state $p = p(\rho)$ (with $dp/d\rho \neq 0$), we can obtain $\rho(r)$ by solving Eq. (11.3.6)
with the initial conditions that $\rho(0)$ have some given value and that

$$\rho'(0) = 0 \tag{11.3.7}$$

(Eq. (11.3.7) also follows from the requirement that $\rho(r)$ be an analytic function
of $x$, $y$, and $z$ at $x = y = z = 0$.)

We still need to prescribe an equation of state. It is often the case that the
internal energy density is proportional to the pressure, that is,

$$e \equiv \rho - m_N n = (\gamma - 1)^{-1} p \tag{11.3.8}$$

(Here $(\gamma - 1)^{-1}$ is just a constant proportionality coefficient; $\gamma$ will not be the
ratio of specific heats unless $e$ and $p$ are proportional to the temperature.) The
condition of uniform entropy per nucleon then reads

$$0 = \frac{1}{\gamma - 1} \left\{ \gamma p\, \frac{d}{dr}\left(\frac{1}{n}\right) + \left(\frac{1}{n}\right) \frac{dp}{dr} \right\}$$

and therefore

$$p \propto n^\gamma$$

or, since $\rho \simeq m_N n$,

$$p = K\rho^\gamma \tag{11.3.9}$$

The proportionality constant $K$ depends on the entropy per nucleon and chemical
composition, but it does not depend on $r$ or on $\rho(0)$. Any star for which the equation
of state takes the form (11.3.9) is called a polytrope.

The fundamental equation (11.3.6) can, in the case of a polytrope, be
transformed into a convenient dimensionless form. Define a new independent variable
$\xi$ by

$$r = \left[\frac{K\gamma}{4\pi G(\gamma - 1)}\right]^{1/2} \rho(0)^{(\gamma - 2)/2}\, \xi \tag{11.3.10}$$

and a new dependent variable $\theta$, by

$$\rho = \rho(0)\,\theta^{1/(\gamma - 1)}, \qquad p = K\rho(0)^\gamma\, \theta^{\gamma/(\gamma - 1)} \tag{11.3.11}$$

Equation (11.3.6) then takes the form

$$\frac{1}{\xi^2} \frac{d}{d\xi}\, \xi^2 \frac{d\theta}{d\xi} + \theta^{1/(\gamma - 1)} = 0 \tag{11.3.12}$$

The boundary conditions are

$$\theta(0) = 1, \qquad \theta'(0) = 0 \tag{11.3.13}$$

[See Eq. (11.3.7).] The function $\theta(\xi)$ defined by (11.3.12), (11.3.13) is known as the
Lane-Emden function* of index $(\gamma - 1)^{-1}$. For $\xi$ near zero, Eq. (11.3.12) gives

$$\theta(\xi) = 1 - \frac{\xi^2}{6} + \frac{\xi^4}{120(\gamma - 1)} - \cdots \tag{11.3.14}$$

Also, it can be shown that for $\gamma > 6/5$, $\theta(\xi)$ vanishes at some finite $\xi_1$:

$$\theta(\xi_1) = 0 \tag{11.3.15}$$

The radius of the star is thus given by Eq. (11.3.10) as

$$R = \left[\frac{K\gamma}{4\pi G(\gamma - 1)}\right]^{1/2} \rho(0)^{(\gamma - 2)/2}\, \xi_1 \tag{11.3.16}$$

We can also use the Lane-Emden solutions to calculate the stellar mass:

$$M = \int_0^R 4\pi r^2 \rho(r)\, dr = 4\pi \rho(0)^{(3\gamma - 4)/2} \left[\frac{K\gamma}{4\pi G(\gamma - 1)}\right]^{3/2} \int_0^{\xi_1} \xi^2\, \theta^{1/(\gamma - 1)}(\xi)\, d\xi = 4\pi \rho(0)^{(3\gamma - 4)/2} \left[\frac{K\gamma}{4\pi G(\gamma - 1)}\right]^{3/2} \xi_1^2\, |\theta'(\xi_1)| \tag{11.3.17}$$

By eliminating $\rho(0)$ in (11.3.16) and (11.3.17), we obtain a relation between $M$
and $R$:

$$M = 4\pi R^{(3\gamma - 4)/(\gamma - 2)} \left[\frac{K\gamma}{4\pi G(\gamma - 1)}\right]^{-1/(\gamma - 2)} \xi_1^{-(3\gamma - 4)/(\gamma - 2)}\, \xi_1^2\, |\theta'(\xi_1)| \tag{11.3.18}$$

Values 5 of the numerical constants $\xi_1$ and $\xi_1^2 |\theta'(\xi_1)|$ are tabulated in Table 11.1.


Table 11.1. Values 5 of the Numerical Parameters $\xi_1$ and $-\xi_1^2\theta'(\xi_1)$ for
Various Newtonian Polytropes

$\gamma$     $\xi_1$       $-\xi_1^2\theta'(\xi_1)$    Examples
6/5      $\infty$      1.73205
11/9     31.83646      1.73780
5/4      14.97155      1.79723
9/7      9.53581       1.89056
4/3      6.89685       2.01824       Largest mass white dwarfs
7/5      5.35528       2.18720
3/2      4.35287       2.41105
5/3      3.65375       2.71406       Small mass white dwarfs
2        $\pi$         $\pi$
3        2.7528        3.7871
$\infty$ $\sqrt{6}$    $2\sqrt{6}$   Incompressible stars
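The entries of Table 11.1 can be reproduced approximately by integrating (11.3.12) outward from the center with the boundary conditions (11.3.13). The following is a minimal sketch, not the method used to compile the table: it assumes a fixed-step fourth-order Runge-Kutta scheme, starts from the series (11.3.14) to avoid the coordinate singularity at $\xi = 0$, and locates $\xi_1$ by linear interpolation. The function name `lane_emden_surface` and the step size are illustrative choices.

```python
def lane_emden_surface(gamma, h=1e-3):
    """Return (xi_1, xi_1^2 |theta'(xi_1)|) for a polytrope with p = K rho^gamma.

    Integrates theta' = phi / xi^2, phi' = -xi^2 theta^n, with n = 1/(gamma - 1)
    and phi = xi^2 theta'. Only meaningful for gamma > 6/5 (finite radius).
    """
    if gamma <= 6.0 / 5.0:
        raise ValueError("theta has no finite zero for gamma <= 6/5")
    n = 1.0 / (gamma - 1.0)

    def rhs(xi, theta, phi):
        # Clamp theta at zero so trial stages past the surface stay real.
        return phi / xi**2, -xi**2 * max(theta, 0.0) ** n

    # Start slightly off-center using the series (11.3.14).
    xi = h
    theta = 1.0 - xi**2 / 6.0 + n * xi**4 / 120.0
    phi = -xi**3 / 3.0 + n * xi**5 / 30.0

    while True:
        k1t, k1p = rhs(xi, theta, phi)
        k2t, k2p = rhs(xi + h / 2, theta + h / 2 * k1t, phi + h / 2 * k1p)
        k3t, k3p = rhs(xi + h / 2, theta + h / 2 * k2t, phi + h / 2 * k2p)
        k4t, k4p = rhs(xi + h, theta + h * k3t, phi + h * k3p)
        theta_new = theta + h / 6 * (k1t + 2 * k2t + 2 * k3t + k4t)
        phi_new = phi + h / 6 * (k1p + 2 * k2p + 2 * k3p + k4p)
        if theta_new <= 0.0:
            # Linear interpolation for the zero crossing of theta.
            frac = theta / (theta - theta_new)
            return xi + frac * h, -(phi + frac * (phi_new - phi))
        xi, theta, phi = xi + h, theta_new, phi_new


for g in (4.0 / 3.0, 5.0 / 3.0):
    print(g, lane_emden_surface(g))
```

With the default step the two printed pairs should agree with the $\gamma = 4/3$ and $\gamma = 5/3$ rows of the table to about three decimal places; a smaller `h` tightens the agreement at the cost of run time.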





For Newtonian stars, $M$ is dominated by the total rest-mass $Nm_N$, so the
nucleon number of the star is given to a good approximation by

$$N \simeq \frac{M}{m_N} \tag{11.3.19}$$

We also want to know the internal energy $E = M - Nm_N$. For general Newtonian
stars this is given by Eqs. (11.1.26), (11.1.29), and (11.1.30) as

$$E = T + V \tag{11.3.20}$$

with the thermal energy $T$ and the gravitational energy $V$ given by

$$T = \int_0^R 4\pi r^2 e(r)\, dr \tag{11.3.21}$$

$$V = -\int_0^R 4\pi r G\mathcal{M}(r)\rho(r)\, dr \tag{11.3.22}$$

We now show that for polytropes, $T$ and $V$ are given by the remarkably simple
formulas 6

$$T = \frac{1}{(5\gamma - 6)} \frac{GM^2}{R} \tag{11.3.23}$$

$$V = -\frac{3(\gamma - 1)}{(5\gamma - 6)} \frac{GM^2}{R} \tag{11.3.24}$$

so the total internal energy is

$$E = -\frac{(3\gamma - 4)}{(5\gamma - 6)} \frac{GM^2}{R} \tag{11.3.25}$$


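To prove the formula for $V$, we use Eq. (11.3.4) to rewrite (11.3.22) as

$$V = 4\pi \int_0^R r^3\, \frac{dp(r)}{dr}\, dr = -12\pi \int_0^R r^2 p(r)\, dr \tag{11.3.26}$$

Multiplying and dividing in the integrand by $\rho(r)$, we have

$$V = -3 \int_0^R \frac{p(r)}{\rho(r)}\, d\mathcal{M}(r) = 3 \int_0^R \mathcal{M}(r)\, d\!\left(\frac{p(r)}{\rho(r)}\right)$$

(We assume here that $\gamma > 1$, so that $p/\rho$ vanishes at $R$.) This can be evaluated by
using the equation of state to calculate

$$\frac{d}{dr}\left(\frac{p(r)}{\rho(r)}\right) = \left(\frac{\gamma - 1}{\gamma}\right) \frac{p'(r)}{\rho(r)} = -\left(\frac{\gamma - 1}{\gamma}\right) \frac{G\mathcal{M}(r)}{r^2}$$

so

$$V = -3 \left(\frac{\gamma - 1}{\gamma}\right) \int_0^R \frac{G\mathcal{M}^2(r)}{r^2}\, dr \tag{11.3.27}$$

Since $dr/r^2 = -d(1/r)$, we can integrate by parts once again, and find

$$V = 3 \left(\frac{\gamma - 1}{\gamma}\right) \left\{ \frac{GM^2}{R} - 2 \int_0^R 4\pi r G\mathcal{M}(r)\rho(r)\, dr \right\} = 3 \left(\frac{\gamma - 1}{\gamma}\right) \left\{ \frac{GM^2}{R} + 2V \right\}$$

Solving for $V$ then gives the desired result (11.3.24). To calculate $T$ we use (11.3.8)
in (11.3.26), which gives

$$V = -3(\gamma - 1)T \tag{11.3.28}$$

Equations (11.3.24) and (11.3.28) then give the desired result (11.3.23).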
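The algebra relating (11.3.23), (11.3.24), (11.3.25), and (11.3.28) is easy to check with exact rational arithmetic. The sketch below is only an illustration; the function name `polytrope_energy_coefficients` is hypothetical, and all energies are expressed in units of $GM^2/R$.

```python
from fractions import Fraction


def polytrope_energy_coefficients(gamma):
    """Return (T, V, E) in units of G M^2 / R for a Newtonian polytrope.

    T and V follow (11.3.23) and (11.3.24); E is their sum, to be compared
    with (11.3.25). Requires gamma != 6/5.
    """
    gamma = Fraction(gamma)
    t = 1 / (5 * gamma - 6)
    v = -3 * (gamma - 1) / (5 * gamma - 6)
    return t, v, t + v


for g in (Fraction(5, 3), Fraction(4, 3)):
    t, v, e = polytrope_energy_coefficients(g)
    # Consistency checks: (11.3.28) and (11.3.25).
    assert v == -3 * (g - 1) * t
    assert e == -(3 * g - 4) / (5 * g - 6)
    print(g, t, v, e)
```

For $\gamma = 4/3$ the total comes out exactly zero, which is the marginal case discussed in the stability argument that follows.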

Inspection of (11.3.17) and (11.3.19) shows that the nucleon number $N$
behaves like $\rho(0)^{(3\gamma - 4)/2}$, whereas (11.3.25), (11.3.16), and (11.3.17) show that
the internal energy $E$ behaves like $\rho(0)^{(5\gamma - 6)/2}$. Thus $\partial N/\partial\rho(0)$ and $\partial E/\partial\rho(0)$
can never vanish together. Theorem 1 of the last section tells us then that each
polytrope is either stable or unstable for all $\rho(0)$, depending on the value of $\gamma$.
But which?

In order to answer this question, we turn to Theorem 2 of the last section,
which tells us that the star will be stable if and only if $E$ is a minimum with respect
to all variations in $\rho(r)$ that leave $N$ (and the equation of state) unchanged. It is
intuitively likely that the first instability to occur will correspond to a uniform
implosion of the whole star, and since we are only trying to answer the yes-or-no
question about stability with respect to this mode, it will hopefully be sufficient
for us to consider only trial configurations with $\rho(r)$ constant. In any such
configuration, (11.3.19), (11.3.21), (11.3.22), and (11.3.8) give

$$N = \frac{M}{m_N} = \frac{4\pi R^3 \rho}{3 m_N} \tag{11.3.29}$$

$$T = \frac{4\pi}{3} (\gamma - 1)^{-1} K \rho^\gamma R^3 \tag{11.3.30}$$

$$V = -\frac{16\pi^2}{15}\, G \rho^2 R^5 \tag{11.3.31}$$

so, eliminating $R$,

$$E = T + V = a\rho^{\gamma - 1} - b\rho^{1/3} \tag{11.3.32}$$

where

$$a = \frac{KM}{(\gamma - 1)} \tag{11.3.33}$$

$$b = \frac{3}{5} \left(\frac{4\pi}{3}\right)^{1/3} G M^{5/3} \tag{11.3.34}$$

For $\gamma > 4/3$, $E$ has a minimum at

$$\rho = \left[\frac{b}{3a(\gamma - 1)}\right]^{1/(\gamma - 4/3)} = \left[\frac{M^{2/3} G (4\pi/3)^{1/3}}{5K}\right]^{1/(\gamma - 4/3)} \tag{11.3.35}$$

corresponding to a configuration of stable equilibrium. For $\gamma = 4/3$, $E$ is stationary
with respect to $\rho$ only if it vanishes everywhere, which requires that $a = b$, or

$$M = \left(\frac{5K}{G}\right)^{3/2} \left(\frac{4\pi}{3}\right)^{-1/2} \tag{11.3.36}$$

For $\gamma < 4/3$, $E$ has a maximum at the point (11.3.35), corresponding to a state of
unstable equilibrium.
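The minimum-versus-maximum statement can be tested numerically by evaluating the trial energy (11.3.32) on either side of the stationary density (11.3.35). This is a sketch under stated assumptions: units with $G = 1$, arbitrary illustrative values $K = M = 1$, and hypothetical function names.

```python
import math


def uniform_density_energy(rho, gamma, K, M, G=1.0):
    """Trial energy E = a rho^(gamma-1) - b rho^(1/3), Eqs. (11.3.32)-(11.3.34)."""
    a = K * M / (gamma - 1.0)
    b = 0.6 * (4.0 * math.pi / 3.0) ** (1.0 / 3.0) * G * M ** (5.0 / 3.0)
    return a * rho ** (gamma - 1.0) - b * rho ** (1.0 / 3.0)


def stationary_density(gamma, K, M, G=1.0):
    """Density at which dE/d(rho) = 0, Eq. (11.3.35). Requires gamma != 4/3."""
    base = M ** (2.0 / 3.0) * G * (4.0 * math.pi / 3.0) ** (1.0 / 3.0) / (5.0 * K)
    return base ** (1.0 / (gamma - 4.0 / 3.0))


for gamma in (5.0 / 3.0, 1.2):
    rho0 = stationary_density(gamma, K=1.0, M=1.0)
    e0 = uniform_density_energy(rho0, gamma, 1.0, 1.0)
    e_lo = uniform_density_energy(0.9 * rho0, gamma, 1.0, 1.0)
    e_hi = uniform_density_energy(1.1 * rho0, gamma, 1.0, 1.0)
    kind = "minimum" if e0 < min(e_lo, e_hi) else "maximum"
    print(f"gamma = {gamma:.4f}: stationary point is a {kind}")
```

The value $\gamma = 1.2$ is used only as a convenient example of an index below 4/3.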

Incidentally, Eq. (11.3.35) gives an estimate for the mass

$$M \simeq \frac{4\pi}{3}\, \rho^{(3\gamma - 4)/2} \left(\frac{15K}{4\pi G}\right)^{3/2}$$

which may be compared with the exact result (11.3.17). The ratio of these two
expressions is

$$\frac{M(\text{variational})}{M(\text{exact})} = \frac{\big(15(\gamma - 1)/\gamma\big)^{3/2}}{3\,\xi_1^2\, |\theta'(\xi_1)|}$$

For $\gamma = 5/3$ this ratio is 1.8; for $\gamma = 4/3$ it is 1.2. Not only does the variational
method give the correct dependence of $M$ on $\rho$ (including the fact that for $\gamma = 4/3$,
$M$ is independent of $\rho$, and $E$ vanishes), but it even provides a fair approximation
to the exact numerical results. We can accept with confidence its prediction that a
polytrope is stable or unstable according to whether $\gamma > 4/3$ or $\gamma < 4/3$. 7
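The two quoted ratios follow directly from the tabulated values of $\xi_1^2|\theta'(\xi_1)|$ in Table 11.1. A minimal sketch of the arithmetic (the function name is illustrative):

```python
def variational_to_exact_mass_ratio(gamma, xi1_sq_abs_dtheta):
    """Ratio M(variational)/M(exact) = (15(gamma-1)/gamma)^(3/2) / (3 xi_1^2 |theta'(xi_1)|)."""
    return (15.0 * (gamma - 1.0) / gamma) ** 1.5 / (3.0 * xi1_sq_abs_dtheta)


# Table 11.1 entries for gamma = 5/3 and gamma = 4/3.
print(round(variational_to_exact_mass_ratio(5.0 / 3.0, 2.71406), 2))
print(round(variational_to_exact_mass_ratio(4.0 / 3.0, 2.01824), 2))
```

Rounded to one decimal place these reproduce the values 1.8 and 1.2 given in the text.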

The variational approach also provides a simple method for estimating the
oscillation frequency for dilation and contraction of the star. Equations
(11.3.29)-(11.3.31) show that for fixed $N$,

$$T \propto R^{3(1 - \gamma)}, \qquad V \propto R^{-1}$$

We can use Eqs. (11.3.23) and (11.3.24) to fix the correct values of $T$ and $V$ at the
equilibrium radius (which we shall now write as $R_{eq}$, to distinguish it from the
instantaneous radius $R$ of an oscillating configuration). This gives then

$$E = \frac{1}{(5\gamma - 6)}\, GM^2 R_{eq}^{3\gamma - 4} R^{3(1 - \gamma)} - \frac{3(\gamma - 1)}{5\gamma - 6}\, GM^2 R^{-1}$$

For $\gamma > 4/3$, this has a minimum at $R = R_{eq}$, as it should. For $R$ near $R_{eq}$, $E$
behaves like

$$E \simeq E_{eq} + \frac{3(\gamma - 1)(3\gamma - 4)}{2(5\gamma - 6)}\, \frac{GM^2}{R_{eq}^3}\, (R - R_{eq})^2$$

The uniform dilation of a sphere with uniform density will give it a kinetic energy

$$U = \frac{3}{10} M \dot{R}^2$$

so the condition of energy conservation, that $U + E$ be constant, leads to modes
with

$$R - R_{eq} \propto \sin \omega_0 t$$

$$\omega_0 = \left[\frac{5(\gamma - 1)(3\gamma - 4)}{5\gamma - 6}\, \frac{GM}{R_{eq}^3}\right]^{1/2} \tag{11.3.37}$$

Finally, we note that a uniform sphere rotating with angular velocity $\Omega$ will
have kinetic energy

$$U = \tfrac{1}{5} M R_{eq}^2 \Omega^2$$

This must be less than the binding energy $-E$, so the maximum angular velocity
with which a star can rotate is of order

$$\Omega_{max} \simeq \left[\frac{5(3\gamma - 4)}{(5\gamma - 6)}\, \frac{GM}{R_{eq}^3}\right]^{1/2} = \frac{\omega_0}{\sqrt{\gamma - 1}} \tag{11.3.38}$$

Of course a star rotating this fast will no longer be a sphere, and (11.3.38) only
gives an order-of-magnitude estimate of the actual maximum rotation frequency.
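To get a feeling for the magnitudes involved, the estimates (11.3.37) and (11.3.38) can be evaluated in c.g.s. units. The sketch below is illustrative only: the mass and radius passed in at the bottom (one solar mass, 5000 km) are assumed round numbers for a white-dwarf-like object, not values taken from the text, and the function names are hypothetical.

```python
import math

G_CGS = 6.674e-8  # Newton's constant in cm^3 g^-1 s^-2


def pulsation_frequency(gamma, mass_g, radius_cm):
    """Angular frequency omega_0 of uniform dilation, Eq. (11.3.37). Needs gamma > 4/3."""
    factor = 5.0 * (gamma - 1.0) * (3.0 * gamma - 4.0) / (5.0 * gamma - 6.0)
    return math.sqrt(factor * G_CGS * mass_g / radius_cm**3)


def max_rotation_rate(gamma, mass_g, radius_cm):
    """Order-of-magnitude maximum angular velocity, Eq. (11.3.38). Needs gamma > 4/3."""
    factor = 5.0 * (3.0 * gamma - 4.0) / (5.0 * gamma - 6.0)
    return math.sqrt(factor * G_CGS * mass_g / radius_cm**3)


# Assumed illustrative inputs: M = 1.989e33 g, R_eq = 5.0e8 cm, gamma = 5/3.
omega0 = pulsation_frequency(5.0 / 3.0, 1.989e33, 5.0e8)
omega_max = max_rotation_rate(5.0 / 3.0, 1.989e33, 5.0e8)
print("omega_0   [rad/s]:", omega0)
print("Omega_max [rad/s]:", omega_max)
print("pulsation period [s]:", 2.0 * math.pi / omega0)
```

For these inputs both frequencies come out of order one radian per second, that is, periods of a few seconds.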

Now let us apply what we have learned to the stars known as white dwarfs.
Imagine an aged star that exhausts its nuclear fuel and begins to cool and contract.
When the temperature is sufficiently low (see below for just how low), the electrons
will be frozen into the lowest available energy levels. The Pauli principle tells us
that there will be two electrons in each level (because of the two spin states
available) and there are $4\pi k^2 (2\pi\hbar)^{-3}\, dk$ levels per unit volume with momenta
between $k$ and $k + dk$, so the number of electrons per unit volume will be related
to the maximum momentum $k_F$ by

$$n = \frac{8\pi}{(2\pi\hbar)^3} \int_0^{k_F} k^2\, dk = \frac{k_F^3}{3\pi^2\hbar^3} \tag{11.3.39}$$

The mass density is

$$\rho = n m_N \mu \tag{11.3.40}$$

where $\mu$ is the number of nucleons per electron; $\mu \simeq 2$ for stars that have used up
their hydrogen. This gives

$$k_F = \hbar \left(\frac{3\pi^2 \rho}{m_N \mu}\right)^{1/3} \tag{11.3.41}$$

The condition that the temperature is negligible is

$$kT \ll [k_F^2 + m_e^2]^{1/2} - m_e$$

The kinetic energy density and pressure of these electrons are

$$e = \frac{8\pi}{(2\pi\hbar)^3} \int_0^{k_F} \left[(k^2 + m_e^2)^{1/2} - m_e\right] k^2\, dk \tag{11.3.42}$$

$$p = \frac{8\pi}{3(2\pi\hbar)^3} \int_0^{k_F} \frac{k^2}{(k^2 + m_e^2)^{1/2}}\, k^2\, dk \tag{11.3.43}$$

[See Eq. (2.8.4).] The equation of state can be made explicit by using (11.3.41) in
(11.3.43).
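The integrals (11.3.42) and (11.3.43) are easy to evaluate numerically once momenta are measured in units of $m_e$. The sketch below assumes a plain composite Simpson rule and the dimensionless variable $x = k_F/m_e$ (both are choices made here for illustration); the common prefactor $8\pi m_e^4/(2\pi\hbar)^3$ cancels in the ratio $e/p$, which should tend to 3/2 for $x \ll 1$ and to 3 for $x \gg 1$.

```python
import math


def simpson(f, a, b, n=20000):
    """Composite Simpson rule with an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


def energy_to_pressure_ratio(x):
    """Ratio e/p for cold degenerate electrons with k_F = x * m_e.

    Uses u = k/m_e, so e ~ integral of (sqrt(u^2+1) - 1) u^2 du and
    p ~ (1/3) integral of u^4 / sqrt(u^2+1) du, with the same prefactor.
    """
    kinetic = simpson(lambda u: (math.sqrt(u * u + 1.0) - 1.0) * u * u, 0.0, x)
    pressure = simpson(lambda u: u**4 / math.sqrt(u * u + 1.0), 0.0, x) / 3.0
    return kinetic / pressure


for x in (0.01, 1.0, 1000.0):
    print(x, energy_to_pressure_ratio(x))
```

The intermediate value $x = 1$ lies between the two limits, which is why the equation of state is not a single polytrope.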

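The equation of state here is not simple, but it reduces to a polytrope in two
extreme cases, distinguished by the criteria $\rho \ll \rho_c$ or $\rho \gg \rho_c$, where $\rho_c$ is the
critical density at which $k_F$ becomes equal to $m_e$ (in c.g.s. units):

$$\rho_c = \frac{m_N \mu\, m_e^3 c^3}{3\pi^2 \hbar^3} = 0.97 \times 10^6\, \mu\ \text{gm/cm}^3 \tag{11.3.44}$$

(A) $\rho \ll \rho_c$. In this case $k_F \ll m_e$, so Eqs. (11.3.42) and (11.3.43) give

$$e = \tfrac{3}{2}\, p, \qquad p = \frac{8\pi k_F^5}{15 m_e (2\pi\hbar)^3} = \frac{\hbar^2}{15 m_e \pi^2} \left(\frac{3\pi^2 \rho}{m_N \mu}\right)^{5/3}$$

This is a polytrope, with

$$\gamma = \tfrac{5}{3}, \qquad K = \frac{\hbar^2}{15 m_e \pi^2} \left(\frac{3\pi^2}{m_N \mu}\right)^{5/3} \tag{11.3.45}$$

Equation (11.3.17) then gives a mass (in c.g.s. units)

$$M = \frac{1}{2} \left(\frac{3\pi}{8}\right)^{1/2} (2.71406) \left(\frac{\hbar^{3/2} c^{3/2}}{m_N^2 \mu^2 G^{3/2}}\right) \left(\frac{\rho(0)}{\rho_c}\right)^{1/2} = 2.79\, \mu^{-2} \left(\frac{\rho(0)}{\rho_c}\right)^{1/2} M_\odot \tag{11.3.46}$$

whereas Eq. (11.3.16) gives a radius (in c.g.s. units)

$$R = \left(\frac{3\pi}{8}\right)^{1/2} (3.65375) \left(\frac{\hbar^{3/2}}{c^{1/2} G^{1/2} m_e m_N \mu}\right) \left(\frac{\rho(0)}{\rho_c}\right)^{-1/6} = 2.0 \times 10^4\, \mu^{-1} \left(\frac{\rho(0)}{\rho_c}\right)^{-1/6}\ \text{km} \tag{11.3.47}$$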


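(B) $\rho \gg \rho_c$. In this case $k_F \gg m_e$, so Eqs. (11.3.42) and (11.3.43) give

$$e = 3p, \qquad p = \frac{8\pi k_F^4}{12 (2\pi\hbar)^3} = \frac{\hbar}{12\pi^2} \left(\frac{3\pi^2 \rho}{m_N \mu}\right)^{4/3}$$

This is a polytrope, with

$$\gamma = \tfrac{4}{3}, \qquad K = \frac{\hbar}{12\pi^2} \left(\frac{3\pi^2}{m_N \mu}\right)^{4/3} \tag{11.3.48}$$

Equation (11.3.17) then gives a unique mass (in c.g.s. units)

$$M = \frac{1}{2} (3\pi)^{1/2} (2.01824) \left(\frac{c^{3/2} \hbar^{3/2}}{G^{3/2} m_N^2 \mu^2}\right) = 5.87\, \mu^{-2} M_\odot \tag{11.3.49}$$

whereas Eq. (11.3.16) gives the radius (in c.g.s. units)

$$R = \frac{1}{2} (3\pi)^{1/2} (6.89685) \left(\frac{\hbar^{3/2}}{c^{1/2} G^{1/2} m_e m_N \mu}\right) \left(\frac{\rho_c}{\rho(0)}\right)^{1/3} = 5.3 \times 10^4\, \mu^{-1} \left(\frac{\rho_c}{\rho(0)}\right)^{1/3}\ \text{km} \tag{11.3.50}$$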
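The numerical coefficients in the critical density and in the limiting mass can be re-derived from physical constants. The sketch below is an illustration under stated assumptions: the constants are rounded present-day c.g.s. values chosen here (with $m_N$ taken as the proton mass), so the results come out close to, but not identical with, the coefficients quoted in the text. The function names are hypothetical.

```python
import math

# Rounded c.g.s. constants (assumed values, not taken from the text).
HBAR = 1.054572e-27      # erg s
C = 2.997925e10          # cm / s
G = 6.6743e-8            # cm^3 g^-1 s^-2
M_NUCLEON = 1.672622e-24  # g (proton mass used for m_N)
M_ELECTRON = 9.10938e-28  # g
M_SUN = 1.98847e33       # g


def critical_density(mu):
    """rho_c = m_N mu m_e^3 c^3 / (3 pi^2 hbar^3), in g/cm^3."""
    return M_NUCLEON * mu * (M_ELECTRON * C / HBAR) ** 3 / (3.0 * math.pi**2)


def limiting_mass_in_solar_units(mu, xi1_sq_abs_dtheta=2.01824):
    """Mass of the gamma = 4/3 polytrope: (1/2) sqrt(3 pi) * 2.01824 * (hbar c/G)^(3/2) / (m_N mu)^2."""
    planck_factor = (HBAR * C / G) ** 1.5 / (M_NUCLEON * mu) ** 2
    return 0.5 * math.sqrt(3.0 * math.pi) * xi1_sq_abs_dtheta * planck_factor / M_SUN


print("rho_c for mu = 1 [g/cm^3]:", critical_density(1.0))
print("limiting mass, mu = 1 [M_sun]:", limiting_mass_in_solar_units(1.0))
print("limiting mass, mu = 56/26 [M_sun]:", limiting_mass_in_solar_units(56.0 / 26.0))
```

The $\mu^{-2}$ scaling is the important feature: going from $\mu = 1$ to $\mu = 56/26$ lowers the limiting mass by a factor of about 4.6.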


We note that $\gamma > 4/3$ for $\rho(0) \ll \rho_c$, so the least massive white dwarfs are
definitely stable. We also see that $M$ appears to grow monotonically with increasing
central density, reaching a maximum (11.3.49) when $\rho(0) \to \infty$, so there is no
point where the star can become unstable. Our tentative conclusion is that stable
white dwarfs can exist for any mass less than (11.3.49). This maximum mass is
known as the Chandrasekhar limit.*

Actually, matters are not so simple. When $k_F \simeq 5m_e$, it becomes energetically
favorable for electrons to be captured by nuclei, turning protons into neutrons,
and producing neutrinos that escape forthwith. The effect is to increase the number
$\mu$ of nucleons per electron, and according to Eq. (11.3.46) this will reduce the mass
$M$ for a given central density. We therefore expect $M$ to increase toward the
Chandrasekhar limit until $\rho(0) \simeq 5^3 \rho_c$ [see Eqs. (11.3.41) and (11.3.44)], where
$M$ reaches a maximum and then begins to decrease. Detailed calculations 9 show
that the maximum mass is $1.2\, M_\odot$, nearly equal to the Chandrasekhar limit, which
is $1.26\, M_\odot$ for $\mu = 56/26$. The radius of a star with this maximum mass is about
$4 \times 10^3$ km. Theorem 2 of Section 11.2 suggests that this maximum is a point of
transition from stability to instability, so stable iron white dwarfs can exist only
for $M < 1.2\, M_\odot$.

To the student of general relativity, the most interesting parameter
characterizing a white dwarf is the absolute value $GM/R$ of the gravitational potential
at its surface. For $\rho(0) \ll \rho_c$, this is given by (11.3.46) and (11.3.47) as

$$\frac{GM}{R} = \frac{1}{2} \left(\frac{2.71406}{3.65375}\right) \mu^{-1} \left(\frac{m_e}{m_N}\right) \left(\frac{\rho(0)}{\rho_c}\right)^{2/3} \tag{11.3.51}$$

whereas for $\rho(0) \gg \rho_c$, it is given by Eqs. (11.3.49) and (11.3.50) as

$$\frac{GM}{R} = \left(\frac{2.01824}{6.89685}\right) \mu^{-1} \left(\frac{m_e}{m_N}\right) \left(\frac{\rho(0)}{\rho_c}\right)^{1/3} \tag{11.3.52}$$

We see that $GM/R$ is always going to be quite small, because $m_e/m_N = 5.4 \times 10^{-4}$.
Thus general relativity plays no important role in the structure of white dwarfs.
The quantity $GM/R$ increases with increasing central density, so it is largest at the
maximum mass $1.2\, M_\odot$, where it takes the value $4 \times 10^{-4}$. Our old friend 40
Eridani B has $GM/R \simeq 6 \times 10^{-5}$ (see Section 3.5), so it is not going to be
possible to improve astronomical red-shift experiments very dramatically by
finding white dwarfs with much larger red shifts.


4 Neutron Stars 

We saw in the last section that a white dwarf star supported by the pressure
of cold degenerate electrons cannot be in equilibrium if its mass is greater than the
Chandrasekhar limit, about $\hbar^{3/2}/m_N^2 G^{3/2}$. Also, the gravitational potential at
the surface of such a star cannot be greater than about $m_e/m_N$, so general relativity
plays no role in its structure.

Continuing our search for astrophysical applications of general relativity, let 
us ask what happens when a star whose mass is above the Chandrasekhar limit 
reaches the end of its thermonuclear evolution and grows cold. Its internal pressure 
then fails to support it, and it collapses. One possibility is that the star will simply 
go on collapsing forever, in which case general relativity will certainly come into 
play. Another possibility is that the star will become so heated during its collapse 
that it will explode, becoming a supernova. It might then blow off enough matter 
so that its mass drops below the Chandrasekhar limit. It is believed that in this 
case the highly compressed remnant does not find its quietus as a white dwarf, 
but rather becomes a superdense neutron star. 10 (See Figure 11.2.)

[Figure 11.2 appears here; the only axis label recoverable from the scan is R (km).]

Figure 11.2 Configurations of stellar equilibrium. The solid curves on the left and
right represent the Oppenheimer-Volkoff 10 solution for a pure neutron star and the
Chandrasekhar 8 solution for a pure Fe$^{56}$ white dwarf star, respectively. The dashed
lines give the extrapolated nonrelativistic solutions in these two cases. The dotted line
represents the interpolating solution of Harrison, Thorne, Wakano, and Wheeler, 12
which takes into account the shift in chemical composition from Fe$^{56}$ to neutrons.
Arrows indicate the direction of increasing central density. As shown in Theorem 1,
the various transitions between stability and instability occur at the maxima and
minima of $M$, marked here with small circles.


A neutron star is like a white dwarf, except that it consists almost entirely
of “cold” degenerate neutrons, all electrons and protons having been converted into
neutrons through the reaction

$$p + e^- \to n + \nu$$

the neutrinos escaping the star. Enough electrons and protons must remain so
that the Pauli principle prevents neutron beta decay, $n \to p + e^- + \bar{\nu}$; this sets
a lower limit on the mass of stable neutron stars, to be evaluated below.

Neutron stars of low mass are much like white dwarfs of the same mass,
except that neutron degeneracy pressure replaces electron degeneracy pressure,
and thus $m_e$ should be replaced in all formulas with $m_n$ (and $\mu$ should be set equal
to unity). Thus, by noting how $m_e$ enters in the formulas (11.3.44)-(11.3.47) for
small white dwarfs, we can immediately conclude that a neutron star of small
mass will have a central density higher than that of a white dwarf with the same
mass (and $\mu = 2$) by a factor $\tfrac{1}{2}(m_n/m_e)^3 = 3.1 \times 10^9$, and will have a radius
smaller by a factor $m_n/2m_e = 920$.

The electrons in a white dwarf will begin to be relativistic when its mass
becomes comparable to the theoretical upper limit, given by Eq. (11.3.49). Since
$m_e$ does not enter in (11.3.49), we expect that the neutrons in a neutron star will
begin to be relativistic at just such masses, that is, when $M$ is of order $M_\odot$.
However, at this point the analogy between white dwarfs and neutron stars
breaks down. For one thing, the total energy density $\rho$ of a white dwarf is always
dominated by the rest-mass density of its nonrelativistic nucleons, whereas a
neutron star with mass of order $M_\odot$ will consist of nucleons whose kinetic energies
are comparable with their rest-masses. Another difference that is even more
interesting is that, whereas a white dwarf whose electrons are moderately
relativistic will have a surface gravitational potential $GM/R$ of order $m_e/m_n$, a
neutron star of equal mass will have a surface potential roughly of order unity.
Thus general relativity will necessarily play a role in the theory of the more
massive neutron stars.

In order to formulate the quantitative theory of neutron stars, we begin by
writing down expressions for the total energy density and pressure of an ideal
Fermi gas of neutrons with maximum momentum $k_F$:

$$\rho = \frac{8\pi}{(2\pi\hbar)^3} \int_0^{k_F} (k^2 + m_n^2)^{1/2}\, k^2\, dk = 3\rho_c \int_0^{k_F/m_n} (u^2 + 1)^{1/2}\, u^2\, du \tag{11.4.1}$$

$$p = \frac{8\pi}{3(2\pi\hbar)^3} \int_0^{k_F} \frac{k^2}{(k^2 + m_n^2)^{1/2}}\, k^2\, dk = \rho_c \int_0^{k_F/m_n} (u^2 + 1)^{-1/2}\, u^4\, du \tag{11.4.2}$$

where now (in c.g.s. units)

$$\rho_c = \frac{8\pi m_n^4 c^3}{3(2\pi\hbar)^3} = 6.11 \times 10^{15}\ \text{gm/cm}^3 \tag{11.4.3}$$


By eliminating $k_F$ in Eqs. (11.4.1) and (11.4.2), we obtain the equation of state
in the form

$$\frac{p}{\rho_c} = F\!\left(\frac{\rho}{\rho_c}\right) \tag{11.4.4}$$

with $F$ a definite transcendental function. The structure of a neutron star with
given central density $\rho(0)$ is to be calculated by solving (11.1.13) with $p$ given as a
function of $\rho$ by (11.4.4). Since the only dimensional quantities in these equations
are $\rho(0)$, $\rho_c$, and $G$, the solution must give the mass and radius as functions of
$\rho(0)$ of the form

$$M = M_0\, f\!\left(\frac{\rho(0)}{\rho_c}\right) \tag{11.4.5}$$

$$R = R_0\, g\!\left(\frac{\rho(0)}{\rho_c}\right) \tag{11.4.6}$$

where (in c.g.s. units)

$$R_0 = c\,(8\pi G \rho_c)^{-1/2} = 3.0\ \text{km} \tag{11.4.7}$$

$$M_0 = \frac{c^2 R_0}{G} = 2.0\, M_\odot \tag{11.4.8}$$

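and $f$ and $g$ are unknown dimensionless functions. This problem, like that of the
white dwarfs, is analytically tractable only for very large and very small central
densities.

For $\rho(0) \ll \rho_c$ we can use the analogy with white dwarfs discussed above,
and conclude from Eqs. (11.3.46) and (11.3.47) that

$$M = \frac{1}{2} \left(\frac{3\pi}{8}\right)^{1/2} (2.71406) \left(\frac{\hbar^{3/2}}{m_n^2 G^{3/2}}\right) \left(\frac{\rho(0)}{\rho_c}\right)^{1/2} = \frac{1}{2} (2.71406)\, M_0 \left(\frac{\rho(0)}{\rho_c}\right)^{1/2} \tag{11.4.9}$$

$$R = \left(\frac{3\pi}{8}\right)^{1/2} (3.65375) \left(\frac{\hbar^{3/2}}{m_n^2 G^{1/2}}\right) \left(\frac{\rho_c}{\rho(0)}\right)^{1/6} = (3.65375)\, R_0 \left(\frac{\rho_c}{\rho(0)}\right)^{1/6} \tag{11.4.10}$$

with $\rho_c$ now given by Eq. (11.4.3).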

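For $\rho(0) \gg \rho_c$, the neutrons near the center of the star have $k_F \gg m_n$, so
(11.4.1) and (11.4.2) give

$$\rho \simeq \frac{3\rho_c}{4} \left(\frac{k_F}{m_n}\right)^4, \qquad p \simeq \frac{\rho_c}{4} \left(\frac{k_F}{m_n}\right)^4$$

and therefore

$$p = \frac{\rho}{3} \tag{11.4.11}$$

as would be expected for a gas of highly relativistic particles. Using this equation
of state in the fundamental differential equation (11.1.13) gives

$$-r^2 \rho'(r) = 4G\mathcal{M}(r)\rho(r) \left[1 + \frac{4\pi r^3 \rho(r)}{3\mathcal{M}(r)}\right] \left[1 - \frac{2G\mathcal{M}(r)}{r}\right]^{-1} \tag{11.4.12}$$

Amazingly, we can find an exact solution of this equation 11 :

$$\rho(r) = \frac{3}{56\pi G r^2} \tag{11.4.13}$$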
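That $\rho(r) = 3/56\pi G r^2$ solves the structure equation with $p = \rho/3$ can be confirmed by substitution. The sketch below assumes the equation in the form $-r^2\rho' = 4G\mathcal{M}\rho\,[1 + 4\pi r^3\rho/3\mathcal{M}]\,[1 - 2G\mathcal{M}/r]^{-1}$, which is what (11.1.13) reduces to for this equation of state, and works in units with $G = 1$; the function name is illustrative.

```python
import math


def structure_equation_residual(r, G=1.0):
    """Left side minus right side of the p = rho/3 structure equation
    evaluated on the trial solution rho = 3 / (56 pi G r^2)."""
    rho = 3.0 / (56.0 * math.pi * G * r * r)
    drho_dr = -2.0 * rho / r            # derivative of a 1/r^2 profile
    mass = 3.0 * r / (14.0 * G)         # integral of 4 pi r'^2 rho(r') from 0 to r
    lhs = -r * r * drho_dr
    rhs = (4.0 * G * mass * rho
           * (1.0 + 4.0 * math.pi * r**3 * rho / (3.0 * mass))
           / (1.0 - 2.0 * G * mass / r))
    return lhs - rhs


for r in (0.1, 1.0, 10.0):
    print(r, structure_equation_residual(r))
```

Along this solution $2G\mathcal{M}(r)/r = 3/7$ at every radius, and the enclosed mass grows only linearly with $r$, so the singular density at the origin contributes a finite mass.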


corresponding to the limit $\rho(0) \to \infty$. However, even in the limit of infinite central
density, this $\rho(r)$ will drop below $\rho_c$ at a radius $r$ of order $R_0$, so that the equation
of state (11.4.11) is not valid for the outer layers of any neutron star. To deal with
the crust of nonrelativistic neutrons, it is necessary to solve the full equation
(11.1.13) using the equation of state (11.4.4); the condition of infinite central
density is imposed by (11.4.13) for $r \ll R_0$. We shall not do this here; the important
points are that the solution has a finite radius $R$ where $p$ vanishes, and that the
mass $M$ within this radius is finite, because the singularity in Eq. (11.4.13) is
integrable at $r = 0$. Thus the mass and radius of a neutron star approach finite
limits as $\rho(0) \to \infty$. Numerical solution of the fundamental equation (11.1.13)
gives these limits as 10

$$M_\infty = 0.171\, M_0, \qquad R_\infty = 1.06\, R_0 \tag{11.4.14}$$

There remains the question of stability. For $\rho(0) \ll \rho_c$, a pure neutron star
is simply a Newtonian polytrope with $\gamma = 5/3$ (like a small white dwarf) and is
therefore stable. (See Section 11.3.) Equation (11.4.9) shows that $M$ is a
monotonically increasing function of $\rho(0)$ for these small central densities. If $M$
continues to increase monotonically to the value $M_\infty$ then no transition to
instability can occur, according to Theorem 1 of Section 11.2. But (11.4.9) shows
that when $\rho(0) = 0.016\, \rho_c$ (which is small enough for (11.4.9) to be a good
approximation), the mass $M$ is already greater than $M_\infty$. Thus we expect that $M$
rises to a maximum value $M_m > M_\infty$ at some central density $\rho_m$ of order $\rho_c$, and
then drops to the value $M_\infty$ at infinite central density. This expectation is confirmed
by detailed calculation 10 using Eqs. (11.1.13) and (11.4.1)-(11.4.3). The mass $M$
of a pure ideal-gas neutron star reaches a maximum

$$M_m = 0.36\, M_0 = 0.7\, M_\odot \tag{11.4.15}$$

at a radius

$$R_m = 3.2\, R_0 = 9.6\ \text{km} \tag{11.4.16}$$

Since this is a point where $\partial M/\partial\rho(0)$ vanishes, we expect a transition here from
stability to instability with respect to radial oscillations. Thus (11.4.15) and
(11.4.16) characterize a neutron star with the greatest mass and central density
allowed by the requirement that the star be stable. The mass (11.4.15) is known as
the Oppenheimer-Volkoff limit. Note that the fractional red shift of a spectral line
emitted from the surface of such a neutron star is

$$z = \left[1 - \frac{2M_m G}{R_m}\right]^{-1/2} - 1 = 0.13 \tag{11.4.17}$$

[See Eqs. (3.5.3), (11.1.1), and (11.1.17).] Evidently general relativity is just 
beginning to be important for the most massive stable neutron stars. 
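The surface red shift follows from the quoted maximum mass and radius in two lines of arithmetic. In the sketch below the inputs are the dimensionless ratios $M_m/M_0 = 0.36$ and $R_m/R_0 = 3.2$ from the text; since $M_0 = c^2 R_0/G$, the combination $2GM/Rc^2$ equals $2(M/M_0)/(R/R_0)$. The function name is illustrative.

```python
def surface_redshift(m_over_m0, r_over_r0):
    """Fractional red shift z = [1 - 2GM/(R c^2)]^(-1/2) - 1,
    with M and R given in units of M_0 and R_0 (M_0 = c^2 R_0 / G)."""
    compactness = 2.0 * m_over_m0 / r_over_r0
    return (1.0 - compactness) ** -0.5 - 1.0


print(surface_redshift(0.36, 3.2))
```

With these rounded inputs the result is a little above 0.13; the difference from the quoted figure is within what the two-digit inputs allow.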

Of course, a neutron star cannot consist purely of neutrons, if only because 
we need a Fermi sea of electrons so that the Pauli exclusion principle can block 
the neutrons’ beta decay. In order to get a first taste of the chemical composition 
in a neutron star, let us consider the equilibrium among neutrons, protons, and 
electrons. The energy density and number density of each one of these three
Fermi gases are given (for $i = n, p, e$) by

$$\rho_i = \frac{8\pi}{(2\pi\hbar)^3}\int_0^{k_{F,i}} \sqrt{k^2 + m_i^2}\;k^2\,dk \tag{11.4.18}$$

$$n_i = \frac{8\pi}{(2\pi\hbar)^3}\int_0^{k_{F,i}} k^2\,dk = \frac{k_{F,i}^3}{3\pi^2\hbar^3} \tag{11.4.19}$$


At any given point in the star, the reactions $n \to p + e^- + \bar{\nu}$ and $p + e^- \to n + \nu$
can convert neutrons into protons and vice versa. (The neutrinos escape.) These
reactions preserve the total number density of baryons,

$$n_n + n_p = n_B \quad (\text{fixed}) \tag{11.4.20}$$

and preserve charge neutrality,

$$n_e = n_p \tag{11.4.21}$$


But with $n_B$ fixed, the total energy density may be expressed in terms of $n_n$ alone:

$$\rho = \rho_n + \rho_p + \rho_e = \frac{3}{C^3}\left\{\int_0^{C n_n^{1/3}} \sqrt{k^2 + m_n^2}\;k^2\,dk + \int_0^{C[n_B - n_n]^{1/3}} \sqrt{k^2 + m_p^2}\;k^2\,dk + \int_0^{C[n_B - n_n]^{1/3}} \sqrt{k^2 + m_e^2}\;k^2\,dk\right\} \tag{11.4.22}$$

where

$$C \equiv (3\pi^2\hbar^3)^{1/3}$$

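Chemical equilibrium is reached when this function is a minimum, that is, at

$$\left(C^2 n_n^{2/3} + m_n^2\right)^{1/2} - \left(C^2[n_B - n_n]^{2/3} + m_p^2\right)^{1/2} - \left(C^2[n_B - n_n]^{2/3} + m_e^2\right)^{1/2} = 0$$

We can solve for $n_p = n_B - n_n$ as a function of $n_n$, and find

$$\frac{n_p}{n_n} = \frac{1}{8}\left\{\frac{1 + \dfrac{2(m_n^2 - m_p^2 - m_e^2)}{C^2 n_n^{2/3}} + \dfrac{(m_n^2 - m_p^2)^2 - 2m_e^2(m_n^2 + m_p^2) + m_e^4}{C^4 n_n^{4/3}}}{1 + \dfrac{m_n^2}{C^2 n_n^{2/3}}}\right\}^{3/2}$$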


The nucleon mass difference $Q \equiv m_n - m_p$ and the electron mass $m_e$ are of comparable
magnitude and very much less than $m_n$, so this result can be written more
simply as

$$\frac{n_p}{n_n} \simeq \frac{1}{8}\left\{\frac{1 + \dfrac{4Q}{m_n}\left(\dfrac{\rho_c}{m_n n_n}\right)^{2/3} + \dfrac{4(Q^2 - m_e^2)}{m_n^2}\left(\dfrac{\rho_c}{m_n n_n}\right)^{4/3}}{1 + \left(\dfrac{\rho_c}{m_n n_n}\right)^{2/3}}\right\}^{3/2} \tag{11.4.23}$$

where $\rho_c \equiv m_n^4/C^3$ is the critical density previously defined in Eq. (11.4.3).

The condition for the neutrons to be stable against beta decay is that the
electron Fermi sea should be filled up to a momentum greater than the maximum
momentum $k_{\max}$ of the electron emitted in neutron beta decay:

$$k_{F,e} > k_{\max} \tag{11.4.24}$$

where

$$k_{\max} = \frac{\left[(m_n^2 - m_p^2)^2 - 2m_e^2(m_n^2 + m_p^2) + m_e^4\right]^{1/2}}{2m_n} \simeq \left[Q^2 - m_e^2\right]^{1/2} = 1.19\ \text{MeV} \tag{11.4.25}$$

The electron Fermi momentum is given by (11.4.19) and (11.4.21) as

$$k_{F,e}^2 = C^2 n_p^{2/3} = m_n^2\left(\frac{m_n n_n}{\rho_c}\right)^{2/3}\left(\frac{n_p}{n_n}\right)^{2/3} \simeq \frac{\dfrac{m_n^2}{4}\left(\dfrac{m_n n_n}{\rho_c}\right)^{4/3} + Q m_n\left(\dfrac{m_n n_n}{\rho_c}\right)^{2/3} + Q^2 - m_e^2}{\left(\dfrac{m_n n_n}{\rho_c}\right)^{2/3} + 1} \tag{11.4.26}$$


This is smallest at $n_n = 0$, where $k_{F,e}$ barely equals the value $k_{\max}$. Hence the
condition (11.4.24) for neutron beta stability is indeed satisfied for any positive
neutron density.

The proton-neutron ratio (11.4.23) is large and decreasing for very small
neutron densities, reaches a minimum for $m_n n_n$ equal to the transition density

$$\rho_T \simeq \rho_c\left[\frac{4(Q^2 - m_e^2)}{m_n^2}\right]^{3/4} = 1.28 \times 10^{-4}\,\rho_c \tag{11.4.27}$$

where

$$\left(\frac{n_p}{n_n}\right)_{\min} \simeq \left[\frac{Q + (Q^2 - m_e^2)^{1/2}}{m_n}\right]^{3/2} = 0.002 \tag{11.4.28}$$

and then rises monotonically, reaching the value 1/8 for $n_n m_n \gg \rho_c$. Stars with a
central density somewhat less than the transition value (11.4.27) are not really
neutron stars at all, but belong to the extreme high-density branch of the white
dwarf equilibrium solutions, and are therefore unstable. (See Section 11.3.) Thus
we expect there to be some minimum central density of order $\rho_T$, and some
minimum mass of order 3ilf 0 (p r /p c ) 1/2 ~ 0.03ilf o [see Eq. (11.4.9)], below which 
stable neutron stars could not exist. Detailed calculations² show that the minimum
mass of a neutron star is actually about $0.2\,M_\odot$.
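
The behavior of the proton-neutron ratio described above can be explored numerically. The following is a minimal sketch that evaluates the approximate formula (11.4.23) as a function of $x = m_n n_n/\rho_c$; the particle masses (in MeV) are standard values supplied here as assumptions, and the function names are illustrative.

```python
M_N = 939.565   # neutron mass in MeV (assumed standard value)
Q = 1.293       # neutron-proton mass difference in MeV (assumed standard value)
M_E = 0.511     # electron mass in MeV (assumed standard value)

def proton_neutron_ratio(x):
    """n_p/n_n from the approximate Eq. (11.4.23); x = m_n n_n / rho_c."""
    y = x ** (-2.0 / 3.0)                      # (rho_c / m_n n_n)^(2/3)
    a = 4.0 * Q / M_N
    b = 4.0 * (Q * Q - M_E * M_E) / M_N ** 2
    return 0.125 * ((1.0 + a * y + b * y * y) / (1.0 + y)) ** 1.5

def electron_fermi_momentum(x):
    """k_{F,e} in MeV, from k_{F,e} = m_n (m_n n_p / rho_c)^(1/3)."""
    return M_N * (x * proton_neutron_ratio(x)) ** (1.0 / 3.0)

def transition_density(lo=1e-6, hi=1e2, steps=20000):
    """Scan log-spaced densities and return the x that minimizes n_p/n_n."""
    best_x, best_ratio = lo, proton_neutron_ratio(lo)
    for i in range(1, steps + 1):
        x = lo * (hi / lo) ** (i / steps)
        ratio = proton_neutron_ratio(x)
        if ratio < best_ratio:
            best_x, best_ratio = x, ratio
    return best_x

print("density of minimum ratio / rho_c:", transition_density())
print("ratio at very high density:", proton_neutron_ratio(1e6))
print("k_F,e at very low density (MeV):", electron_fermi_momentum(1e-9))
```

The scan locates the minimum close to the transition density (11.4.27), the ratio tends to 1/8 at high density, and the low-density limit of $k_{F,e}$ reproduces $k_{\max}$ of (11.4.25).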

The small hydrogen contamination in a neutron star is more interesting than 
might be thought, because the filling of proton and electron as well as neutron
energy levels will block the decay of various particles besides the neutron that
are normally unstable. For instance, the $\mu^-$-meson becomes stable when $k_{F,e} > 53$
MeV, because then the Pauli principle will block the emission of electrons in the
decay process $\mu^- \to e^- + \nu + \bar{\nu}$. According to Eq. (11.4.26), this happens when
the density $\rho \simeq m_n n_n$ reaches the value $0.038\,\rho_c$, when $n_p = 0.005\,n_n$. When the
density reaches $0.107\,\rho_c$, with $n_p = 0.013\,n_n$, the electron Fermi momentum
reaches the $\mu$-meson mass 105 MeV, and it becomes energetically favorable for
electrons at the top of the Fermi sea to be converted (say by collisions) into
$\mu^-$-mesons, with a neutrino-antineutrino pair escaping from the star. Thus
neutron stars of even moderate mass will be contaminated with $\mu^-$-mesons as well
as hydrogen. The same reasoning leads us to expect that hyperons and various 
excited states of the nucleons and hyperons also will be stable and present in small 
amounts. 

This raises an interesting question of principle. For instance, the famous 3-3 
resonance in pion-nucleon scattering may be thought of either as a manifestation 
of the pion-nucleon force, or as a particle, the $\Delta$ baryon, with a mass 1236 MeV
and the very short lifetime $5.5 \times 10^{-24}$ sec. Should we include the $\Delta$ in ideal-gas
models of neutron stars? Normally one would think not, but for high enough
nucleon density the Pauli principle will block the decays $\Delta \to N + \pi$, $\Delta \to N + \gamma$,
and so on, and energetic considerations will favor the conversion of some neutrons
and protons into $\Delta$’s. Of course, it is possible that the strong interactions among
nucleons simply rule out any ideal-gas model of a dense neutron star, but it is
also possible that the effects of these forces can be taken into account by treating
a neutron star as an ideal gas of neutrons, protons, electrons, $\mu^-$-mesons, hyperons,
and nucleon and hyperon resonances. (See Section 15.11.) 

In any case, it should be clear that the Oppenheimer-Volkoff calculation, 
which treats a neutron star as a pure ideal gas of neutrons, must be used with a 
good deal of caution when $\rho(0)$ is comparable with or greater than $\rho_c$. Merely
including protons and electrons as well as neutrons in an ideal-gas model does not
by itself have a serious effect on the structure of a neutron star,¹² but nuclear
forces can be quite important: for instance, various detailed calculations yield
values of the maximum stable mass equal to $0.37\,M_\odot$,¹³ $1.95\,M_\odot$,¹⁴ and $2.4\,M_\odot$.¹⁵
Even these models are still highly idealized; a real neutron star is expected to have
a crystalline crust,¹⁶ a superfluid interior,¹⁷ powerful magnetic fields,¹⁸ and often
a very rapid rate of rotation.¹⁹

The discovery²⁰ in 1967 of “pulsars,” stars that emit radiation at various
wavelengths in regular pulses separated by intervals from a few seconds down to
0.033 sec, suggests that we should look into the possible rotation and vibration
periods of neutron stars and white dwarfs. Equations (11.3.37) and (11.3.38)
show that for all $\gamma$ except 6/5, 4/3, or 1, the maximum rotation frequency and the
fundamental vibration frequency of any Newtonian polytrope are both of order
$\sqrt{GM/R^3}$. Presumably this result holds to within an order of magnitude for any
stable neutron star; in this case the characteristic frequency is greatest when $M$
and $R$ take the values (11.4.15) and (11.4.16), where it has the value

$$\left(\frac{G M_m}{R_m^3}\right)^{1/2} \simeq 10^4\ \text{sec}^{-1} \tag{11.4.29}$$


This is considerably faster than any observed pulsar frequency. It is currently
believed that pulsars are rotating neutron stars,²¹ which begin life with a rotation
frequency near the maximum, of order $10^4$ sec$^{-1}$, but which subsequently slow
down, losing their energy through gravitational and electromagnetic radiation and
through electromagnetic acceleration of charged particles. (In order to account for
the existence of this radiation as well as for the observed pulses, it is necessary to
suppose that the star does not have circular symmetry about the axis of rotation,
as would be the case if its magnetic poles are offset from its rotation poles.) This
interpretation is supported by the observation that several pulsars are slowing
down.²²

A white dwarf with the same mass as a neutron star will have a radius larger
by a factor $m_n/\mu m_e \simeq 900$, so the fundamental vibration frequency and the
maximum rotation frequency will be smaller than for a neutron star by a factor
$3 \times 10^{-5}$. For $M$ near $M_m$ this gives a characteristic frequency smaller than
(11.4.29) by this factor, or about 0.3 sec$^{-1}$. This is slower than the observed pulse
rate of most pulsars. Pulsars are probably neutron stars, not white dwarfs.
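
The two order-of-magnitude frequencies quoted above follow from $\sqrt{GM/R^3}$ alone. A minimal sketch, assuming SI values for $G$ and the solar mass (both supplied here, not taken from the text) and using the factor 900 for the white-dwarf radius:

```python
import math

G = 6.674e-11      # Newton's constant, m^3 kg^-1 s^-2 (assumed SI value)
M_SUN = 1.989e30   # solar mass, kg (assumed SI value)

def characteristic_frequency(mass_kg, radius_m):
    """Order-of-magnitude rotation/vibration frequency sqrt(GM/R^3), in sec^-1."""
    return math.sqrt(G * mass_kg / radius_m ** 3)

# Most massive stable ideal-gas neutron star: 0.7 solar masses, 9.6 km.
f_neutron_star = characteristic_frequency(0.7 * M_SUN, 9.6e3)

# White dwarf of the same mass: radius larger by ~900, so frequency scales as 900^(-3/2).
radius_factor = 900.0
f_white_dwarf = f_neutron_star * radius_factor ** -1.5

print(f"neutron star: {f_neutron_star:.3g} sec^-1")
print(f"white dwarf:  {f_white_dwarf:.2g} sec^-1")
```

The neutron-star value comes out near $10^4$ sec$^{-1}$ and the white-dwarf value at a few tenths of a sec$^{-1}$, matching the estimates in the text to the stated order of magnitude.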


5 Supermassive Stars 


We now turn to a different kind of “star,”²³ in which general relativity enters
in quite a different way. Let us consider a Newtonian star that is supported by the
pressure of radiation rather than of matter; the conditions under which this
occurs will be discovered as we go along. Let us also assume that the star is in
convective equilibrium (see Section 11.1) and has uniform chemical composition.
Radiation has an energy density $e = 3p$, so this star will be a polytrope with
$\gamma = 4/3$, that is,

$$p = K\rho^{4/3} \tag{11.5.1}$$

The radiation pressure is given by the Stefan-Boltzmann law,

$$p_r = \frac{\pi^2 (kT)^4}{45\hbar^3} \tag{11.5.2}$$

so with $p \simeq p_r$, the temperature is given by

$$kT = \left(\frac{45\hbar^3 K}{\pi^2}\right)^{1/4}\rho^{1/3} \tag{11.5.3}$$





The pressure of matter here is given by the ideal-gas law:

$$p_m = \frac{\rho kT}{m} \tag{11.5.4}$$

where $m$ is the mean mass of the gas particles. Thus the ratio of matter to radiation
pressure is

$$\beta \equiv \frac{p_m}{p_r} = \frac{45\hbar^3\rho}{\pi^2 m (kT)^3} = \frac{1}{m}\left(\frac{45\hbar^3}{\pi^2 K^3}\right)^{1/4} \tag{11.5.5}$$

This is a constant throughout the star, so we can use $\beta$ instead of $K$ (or the entropy
per nucleon, on which they both depend) to define the equation of state, writing

$$K = \left(\frac{45\hbar^3}{\pi^2 m^4 \beta^4}\right)^{1/3} \tag{11.5.6}$$


The mass of a polytrope with $\gamma = 4/3$ is given by Eq. (11.3.17) and Table 11.1 as

$$M = 4\pi\,(2.01824)\left(\frac{K}{\pi G}\right)^{3/2} \tag{11.5.7}$$

and, using (11.5.6), this becomes

$$M = \frac{4\,(45)^{1/2}}{\pi^{3/2}}\,(2.01824)\,\frac{\hbar^{3/2}}{G^{3/2} m^2 \beta^2} = 18\,M_\odot\,\beta^{-2}\left(\frac{m_N}{m}\right)^2 \tag{11.5.8}$$

For ionized hydrogen at temperatures between $10^5\,^\circ$K and $10^{10}\,^\circ$K, $m$ is the
average of the proton and the electron mass, so $m \simeq m_N/2$. Thus in this case the
condition for radiation pressure to dominate material pressure by, say, a factor 10,
is that $M > 7200\,M_\odot$. No such supermassive star has been definitely observed,
but they have been considered as possible sites for the production of radiant
energy through gravitational collapse.²³
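
The numerical coefficient in (11.5.8) can be reproduced directly. This is a sketch under stated assumptions: the physical constants are SI values supplied here, a factor of $c$ is restored so that $\hbar^{3/2}/G^{3/2}$ becomes $(\hbar c/G)^{3/2}$, and the function name is illustrative.

```python
import math

HBAR = 1.054571817e-34   # J s (assumed SI value)
C_LIGHT = 2.99792458e8   # m/s (assumed SI value)
G = 6.67430e-11          # m^3 kg^-1 s^-2 (assumed SI value)
M_NUCLEON = 1.6726e-27   # kg, taken as the proton mass (assumption)
M_SUN = 1.989e30         # kg (assumed SI value)

def supermassive_mass(beta, m_over_m_nucleon=0.5):
    """Mass in solar units from Eq. (11.5.8) for a radiation-dominated polytrope."""
    prefactor = 4.0 * math.sqrt(45.0) / math.pi ** 1.5 * 2.01824
    planck_mass = math.sqrt(HBAR * C_LIGHT / G)
    m = m_over_m_nucleon * M_NUCLEON
    return prefactor * planck_mass ** 3 / (m * beta) ** 2 / M_SUN

print("coefficient (beta = 1, m = m_N):", supermassive_mass(1.0, 1.0))
print("ionized hydrogen, beta = 0.1:   ", supermassive_mass(0.1))
```

With $m = m_N$ and $\beta = 1$ the sketch returns about 18 solar masses, and for ionized hydrogen with $\beta = 0.1$ it returns about 7200 solar masses, as stated above.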

The structure of a supermassive star is entirely determined by the equations
for a Newtonian polytrope with $\gamma = 4/3$. In particular, Eq. (11.3.16) gives the
radius of the star as

$$R = 6.89685\left(\frac{K}{\pi G}\right)^{1/2}\rho(0)^{-1/3}$$

and, using (11.5.6), this is

$$R = \left(\frac{45}{\pi^5}\right)^{1/6}(6.89685)\,\frac{\hbar^{1/2}}{m^{2/3} G^{1/2} \beta^{2/3}}\,\rho(0)^{-1/3} \tag{11.5.9}$$
The radius is restricted by our assumption that the rest-mass energy of the star
be much greater than its radiant energy, and, a fortiori, than the thermal energy
of its matter. This condition reads

$$\frac{\pi^2 (kT)^4}{15\hbar^3} \ll \rho$$

or, using (11.5.3) and (11.5.6),

$$\rho \ll \frac{\pi^2 \beta^4 m^4}{1215\,\hbar^3} \tag{11.5.10}$$

The density $\rho$ is greatest at the center, so this can be regarded as a condition on
$\rho(0)$; using (11.5.8) and (11.5.9) to express $\beta$ and $\rho(0)$ in terms of $M$ and $R$, the
condition (11.5.10) becomes

$$\frac{MG}{R} \ll \frac{4}{3}\left(\frac{2.01824}{6.89685}\right) = 0.39 \tag{11.5.11}$$

This is essentially equivalent to the statement that the gravitational potential is
small, which also was assumed. For $M = 10^4\,M_\odot$, Eq. (11.5.11) requires that
$R \gg 4 \times 10^4$ km.

Although we do not need general relativity to understand the structure of
these supermassive stars, we shall need it to settle the question of stability. A
polytrope with $\gamma = 4/3$ is trembling between stability and instability, so it is
necessary to take into account the small effects of the matter pressure and of
general relativity, which play no appreciable role in structure calculations.

We use Theorem 1 of Section 11.2, which tells us that the transition from
stability to instability will occur at a value of $\rho(0)$ for which the internal energy $E$
is stationary. To calculate $E$, we use Eqs. (11.1.29)–(11.1.31), which to first order
in $GM/R$ give

$$E \simeq \int_0^R 4\pi r^2 e(r)\,dr + \int_0^R 4\pi G r\,\mathcal{M}(r)\,e(r)\,dr - \int_0^R 4\pi G r\,\mathcal{M}(r)\,\rho(r)\,dr - \int_0^R 6\pi G^2 \mathcal{M}^2(r)\,\rho(r)\,dr \tag{11.5.12}$$

The internal energy density $e$ is

$$e = \frac{\pi^2 (kT)^4}{15\hbar^3} + \frac{1}{\Gamma - 1}\,\frac{\rho kT}{m} = 3p_r\left[1 + \frac{\beta}{3(\Gamma - 1)}\right]$$

where $\Gamma$ is the specific heat ratio of the matter. (For ionized hydrogen, $\Gamma = 5/3$.)
The total pressure is

$$p = p_r + p_m = p_r(1 + \beta)$$

Therefore, to first order in the small parameter $\beta$, the ratio of energy density to
pressure is given by

$$\frac{e}{3p} = 1 - \frac{(3\Gamma - 4)}{3(\Gamma - 1)}\,\beta + O(\beta^2) \tag{11.5.13}$$


The small correction of order $\beta$ can be ignored in the second term in (11.5.12),
which is already smaller than the first term by a factor of order $GM/R$; but it
must be kept in the large first term, and therefore

$$E \simeq \left[1 - \frac{(3\Gamma - 4)}{3(\Gamma - 1)}\,\beta\right]\int_0^R 12\pi r^2 p(r)\,dr + \int_0^R 12\pi G r\,\mathcal{M}(r)\,p(r)\,dr - \int_0^R 4\pi G r\,\mathcal{M}(r)\,\rho(r)\,dr - \int_0^R 6\pi G^2 \mathcal{M}^2(r)\,\rho(r)\,dr - \cdots \tag{11.5.14}$$

The first integral can be rewritten by integrating by parts:

$$\int_0^R 12\pi r^2 p(r)\,dr = \int_0^R p(r)\,d(4\pi r^3) = -\int_0^R 4\pi r^3 p'(r)\,dr$$

To calculate $p'(r)$, we expand the fundamental equation (11.1.13) to first order in
$GM/R$:

$$-r^2 p'(r) \simeq G\mathcal{M}(r)\rho(r)\left[1 + \frac{p(r)}{\rho(r)} + \frac{4\pi r^3 p(r)}{\mathcal{M}(r)} + \frac{2G\mathcal{M}(r)}{r}\right]$$

so

$$\int_0^R 12\pi r^2 p(r)\,dr \simeq \int_0^R 4\pi G r\,\mathcal{M}(r)\rho(r)\,dr + \int_0^R 4\pi G r\,\mathcal{M}(r)p(r)\,dr + \int_0^R 16\pi^2 G r^4 p(r)\rho(r)\,dr + \int_0^R 8\pi G^2\mathcal{M}^2(r)\rho(r)\,dr$$

The $\beta$-correction needs to be kept in only the first term, which is larger than the
others by a factor of order $R/MG$, so to first order in $\beta$ and $GM/R$, Eq. (11.5.14)
reads

$$E \simeq -\frac{(3\Gamma - 4)}{3(\Gamma - 1)}\,\beta\int_0^R 4\pi G r\,\mathcal{M}(r)\rho(r)\,dr + \int_0^R 16\pi G r\,\mathcal{M}(r)p(r)\,dr + \int_0^R 16\pi^2 G r^4 p(r)\rho(r)\,dr + \int_0^R 2\pi G^2\mathcal{M}^2(r)\rho(r)\,dr \tag{11.5.15}$$


Now every term is small, so they can all be evaluated using for $p$, $\rho$, and $\mathcal{M}$ the
values obtained by solving the Newtonian equation

$$-r^2 p'(r) \simeq G\mathcal{M}(r)\rho(r)$$

for a Newtonian polytrope with $\gamma = 4/3$. In particular, the first integral in Eq.
(11.5.15) is given by setting $\gamma = 4/3$ in Eq. (11.3.24),

$$\int_0^R 4\pi G r\,\mathcal{M}(r)\rho(r)\,dr = -V = \frac{3GM^2}{2R}$$

whereas an integration by parts lets us write the third term as

$$\begin{aligned}
\int_0^R 16\pi^2 G r^4 p(r)\rho(r)\,dr &= \int_0^R 4\pi G r^2 p(r)\,d\mathcal{M}(r) \\
&= -\int_0^R 4\pi G r^2 p'(r)\mathcal{M}(r)\,dr - \int_0^R 8\pi G r\,p(r)\mathcal{M}(r)\,dr \\
&= \int_0^R 4\pi G^2\mathcal{M}^2(r)\rho(r)\,dr - \int_0^R 8\pi G r\,p(r)\mathcal{M}(r)\,dr
\end{aligned}$$

Equation (11.5.15) now reads

$$E \simeq -\frac{(3\Gamma - 4)}{2(\Gamma - 1)}\,\beta\,\frac{GM^2}{R} + \int_0^R 8\pi G r\,\mathcal{M}(r)p(r)\,dr + \int_0^R 6\pi G^2\mathcal{M}^2(r)\rho(r)\,dr$$

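The last two integrals may be calculated in terms of the Lane-Emden function
$\theta(\xi)$ for $\gamma = 4/3$:

$$\int_0^R 6\pi G^2\mathcal{M}^2(r)\rho(r)\,dr = \frac{96}{\sqrt{\pi}}\,\frac{K^{7/2}\rho(0)^{2/3}}{G^{3/2}}\int_0^{\xi_1}\xi^4\,\theta'^2(\xi)\,\theta^3(\xi)\,d\xi$$

$$\int_0^R 8\pi G\,\mathcal{M}(r)p(r)\,r\,dr = \frac{32}{\sqrt{\pi}}\,\frac{K^{7/2}\rho(0)^{2/3}}{G^{3/2}}\int_0^{\xi_1}\xi^3\,|\theta'(\xi)|\,\theta^4(\xi)\,d\xi$$

whereas $K$ and $\rho(0)$ can be expressed in terms of $M$ and $R$ by using (11.3.16) and
(11.3.17), yielding

$$\frac{K^{7/2}\rho(0)^{2/3}}{G^{3/2}} = \frac{\sqrt{\pi}}{64\,\xi_1^4\,|\theta'(\xi_1)|^3}\,\frac{G^2M^3}{R^2}$$

A numerical integration gives²⁴

$$\frac{1}{2\,\xi_1^4\,|\theta'(\xi_1)|^3}\left[\int_0^{\xi_1}\xi^3\,|\theta'(\xi)|\,\theta^4(\xi)\,d\xi + 3\int_0^{\xi_1}\xi^4\,\theta'^2(\xi)\,\theta^3(\xi)\,d\xi\right] = 5.1$$

so, putting this all together, we have at last

$$E \simeq -\frac{(3\Gamma - 4)}{2(\Gamma - 1)}\,\beta\,\frac{GM^2}{R} + 5.1\,\frac{G^2M^3}{R^2} \tag{11.5.16}$$
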
The star is certainly stable when $R$ is so large that general relativity can be
neglected, for then the star behaves like a Newtonian polytrope with

$$\gamma = 1 + \frac{p}{e} \simeq \frac{4}{3} + \frac{(3\Gamma - 4)}{9(\Gamma - 1)}\,\beta > \frac{4}{3}$$

[See Eq. (11.5.13).] The transition from stability to instability will occur when $R$
decreases to a value where

$$\frac{dE}{dR} = \frac{dE}{d\rho(0)}\,\frac{d\rho(0)}{dR} = 0$$

The derivative must be taken with constant entropy per nucleon, and hence in
this case with $\beta$ fixed and $M$ fixed. [See Eqs. (11.5.6) and (11.5.7).] Thus the
minimum radius for stability is

$$R_{\min} = \frac{20.4\,(\Gamma - 1)}{(3\Gamma - 4)}\,\frac{GM}{\beta} \tag{11.5.17}$$

The maximum energy that can be released by letting the star shrink slowly
(through radiation at its surface) to this minimum stable radius is

$$-E_{\min} = \frac{(3\Gamma - 4)^2\,\beta^2 M}{81.6\,(\Gamma - 1)^2} \tag{11.5.18}$$

For instance, a star with $\beta = 0.1$ will have $M \simeq 7200\,M_\odot$; if $\Gamma = 5/3$ then the
minimum radius is $1.45 \times 10^6$ km, and the fraction of its rest-mass that can be
released by assembling the star is 0.03%. The maximum value of the surface
potential $MG/R$ for $\Gamma = 5/3$ is $0.0735\,\beta$, well under the limit (11.5.11).
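
The numbers in this stability argument can be reproduced with a short numerical integration. The following is a sketch, not the calculation cited in the text: it integrates the Lane-Emden equation for index 3 with a fixed-step Runge-Kutta scheme, forms the combination $(I_1 + 3I_2)/(2\xi_1^4|\theta'(\xi_1)|^3)$ (with $I_1 = \int\xi^3|\theta'|\theta^4\,d\xi$ and $I_2 = \int\xi^4\theta'^2\theta^3\,d\xi$) that multiplies $G^2M^3/R^2$ in the energy, and then evaluates the minimum radius and released energy fraction. The step size, the value $GM_\odot/c^2 \approx 1.4766$ km, and all function names are assumptions of the sketch.

```python
def lane_emden_n3(h=1e-3):
    """RK4 integration of theta'' + (2/xi) theta' + theta^3 = 0 with theta(0) = 1.

    Returns xi_1, xi_1^2 |theta'(xi_1)|, I1 and I2 (see the lead-in for definitions).
    """
    def rhs(xi, th, dth):
        return dth, -th ** 3 - 2.0 * dth / xi

    xi = 1e-4                                  # start off-center using the series solution
    th, dth = 1.0 - xi * xi / 6.0, -xi / 3.0
    f1 = xi ** 3 * abs(dth) * th ** 4
    f2 = xi ** 4 * dth ** 2 * th ** 3
    i1 = i2 = 0.0
    while True:
        k1 = rhs(xi, th, dth)
        k2 = rhs(xi + h / 2, th + h / 2 * k1[0], dth + h / 2 * k1[1])
        k3 = rhs(xi + h / 2, th + h / 2 * k2[0], dth + h / 2 * k2[1])
        k4 = rhs(xi + h, th + h * k3[0], dth + h * k3[1])
        th_new = th + h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        dth_new = dth + h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
        if th_new <= 0.0:                      # surface crossed: interpolate to theta = 0
            frac = th / (th - th_new)
            xi1 = xi + frac * h
            dth1 = dth + frac * (dth_new - dth)
            i1 += 0.5 * frac * h * f1          # both integrands vanish at the surface
            i2 += 0.5 * frac * h * f2
            return xi1, xi1 ** 2 * abs(dth1), i1, i2
        xi, th, dth = xi + h, th_new, dth_new
        g1 = xi ** 3 * abs(dth) * th ** 4
        g2 = xi ** 4 * dth ** 2 * th ** 3
        i1 += 0.5 * h * (f1 + g1)              # trapezoid rule
        i2 += 0.5 * h * (f2 + g2)
        f1, f2 = g1, g2

XI1, M1, I1, I2 = lane_emden_n3()
DTH1 = M1 / XI1 ** 2
COEFF = (I1 + 3.0 * I2) / (2.0 * XI1 ** 4 * DTH1 ** 3)   # compare with 5.1

def r_min_over_gm(beta, big_gamma=5.0 / 3.0):
    """Minimum stable radius in units of GM, from dE/dR = 0."""
    return 4.0 * COEFF * (big_gamma - 1.0) / ((3.0 * big_gamma - 4.0) * beta)

def released_fraction(beta, big_gamma=5.0 / 3.0):
    """-E_min / M at the minimum stable radius."""
    return (3.0 * big_gamma - 4.0) ** 2 * beta ** 2 / (16.0 * COEFF * (big_gamma - 1.0) ** 2)

GM_SUN_KM = 1.4766   # G M_sun / c^2 in km (assumed value)
print("xi_1, xi_1^2|theta'|:", XI1, M1)
print("energy coefficient:  ", COEFF)
print("R_min (km), beta=0.1:", r_min_over_gm(0.1) * 7200 * GM_SUN_KM)
print("released fraction:   ", released_fraction(0.1))
```

The first line of output should reproduce the tabulated constants 6.89685 and 2.01824 used in (11.5.7) and (11.5.9); the remaining lines can be compared with the coefficient 5.1, the radius $1.45 \times 10^6$ km, and the 0.03% energy release quoted above.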


6 Stars of Uniform Density 

General relativity finds an interesting application to one other class of stable
stars, those consisting of incompressible fluids, with equation of state

$$\rho = \text{constant} \tag{11.6.1}$$

These stars are of interest, not because they actually exist (they don’t), but because
they are simple enough to allow an exact solution of Einstein’s equations,²⁵ and
because they set an upper limit to the gravitational red shift of spectral lines from
the surface of any star.²⁶

With $\rho$ constant, the fundamental equation (11.1.13) may be written

$$-\frac{p'(r)}{[\rho + p(r)][\rho/3 + p(r)]} = 4\pi G r\left[1 - \frac{8\pi G\rho r^2}{3}\right]^{-1} \tag{11.6.2}$$
The pressure must now be determined by integrating inward from the surface
where $p = 0$, rather than outward, as for more realistic models. This gives

$$\frac{p(r) + \rho}{3p(r) + \rho} = \left[\frac{1 - 8\pi G\rho R^2/3}{1 - 8\pi G\rho r^2/3}\right]^{1/2}$$

Solving for $p(r)$, and expressing $\rho$ in terms of the stellar mass,

$$\rho = \frac{3M}{4\pi R^3} \qquad \text{for } r < R \tag{11.6.3}$$

we find

$$p(r) = \frac{3M}{4\pi R^3}\left\{\frac{[1 - (2MG/R)]^{1/2} - [1 - (2MGr^2/R^3)]^{1/2}}{[1 - (2MGr^2/R^3)]^{1/2} - 3[1 - (2MG/R)]^{1/2}}\right\} \tag{11.6.4}$$

The metric component $A(r)$ is immediately given by Eq. (11.1.11):

$$A(r) = \left[1 - \frac{2MGr^2}{R^3}\right]^{-1} \tag{11.6.5}$$

whereas $B(r)$ can be calculated by using (11.6.4) in the integral (11.1.16):

$$B(r) = \frac{1}{4}\left[3\left(1 - \frac{2MG}{R}\right)^{1/2} - \left(1 - \frac{2MGr^2}{R^3}\right)^{1/2}\right]^2 \tag{11.6.6}$$


The most interesting feature of this solution is that it does not make sense for
all values of $M$ and $R$. The pressure given by Eq. (11.6.4) will become infinite at a
point $r_\infty$ where

$$r_\infty^2 = 9R^2 - \frac{4R^3}{MG} \tag{11.6.7}$$

(Also, the metric becomes singular at $r_\infty$ because $B(r_\infty)$ vanishes.) But the pressure
is a scalar, and so an infinity in $p(r)$ cannot be blamed on an injudicious choice of
coordinate system. We must see to it that $p(r)$ is not singular for any real $r$, and
the only way to accomplish this is to have $r_\infty^2$ negative, or

$$\frac{MG}{R} < \frac{4}{9} \tag{11.6.8}$$

Note that the Schwarzschild radius $2MG$ is then less than 8/9 the actual radius $R$,
so there is no singularity in either the exterior solution (11.1.17) or the interior
solution (11.6.5), (11.6.6).
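
The interior solution can be checked numerically against the equation it is supposed to solve. The sketch below works in units with $R = 1$ and $\rho = 1$, so that the only parameter is $c = MG/R$; in these units $4\pi G\rho = 3c$ and $8\pi G\rho r^2/3 = 2cr^2$. The function names and the finite-difference step are choices of the sketch.

```python
import math

def pressure_over_density(r, c):
    """p(r)/rho from Eq. (11.6.4), with r in units of R and c = MG/R."""
    x = math.sqrt(1.0 - 2.0 * c)            # [1 - 2MG/R]^(1/2)
    y = math.sqrt(1.0 - 2.0 * c * r * r)    # [1 - 2MG r^2/R^3]^(1/2)
    return (x - y) / (y - 3.0 * x)

def tov_residual(r, c, h=1e-5):
    """Relative mismatch between -p'(r) and the right-hand side implied by Eq. (11.6.2)."""
    p = pressure_over_density(r, c)
    dp = (pressure_over_density(r + h, c) - pressure_over_density(r - h, c)) / (2.0 * h)
    rhs = 3.0 * c * r * (1.0 + p) * (1.0 / 3.0 + p) / (1.0 - 2.0 * c * r * r)
    return abs(-dp - rhs) / rhs

def r_infinity_squared(c):
    """(r_inf / R)^2 from Eq. (11.6.7); negative when the pressure is finite everywhere."""
    return 9.0 - 4.0 / c

for c in (0.1, 0.3, 0.44):
    print(c, pressure_over_density(0.0, c), tov_residual(0.5, c), r_infinity_squared(c))
```

The residual stays at the level of the finite-difference error, the central pressure grows without bound as $c$ approaches 4/9, and $r_\infty^2$ changes sign exactly there.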

This is not the first time that we have discovered an upper bound on the
absolute value $MG/R$ of the gravitational potential of a star. We learned in Section
11.4 that for a stable ideal-gas neutron star, $MG/R$ is never greater than 0.36/3.2,
or 0.11. [See Eqs. (11.4.15) and (11.4.16).] Is there then an absolute upper limit to
$MG/R$ imposed by the structure of the Einstein equations, irrespective of the
equation of state?
To frame this question as a mathematical problem, we consider $\rho$ as an
arbitrary finite positive function, subject only to these general requirements:

(A) The radius $R$ is fixed, with

$$\rho(r) = 0 \qquad \text{for } r > R \tag{11.6.9}$$

(B) The mass $M$ is fixed, with

$$\int_0^R 4\pi r^2\rho(r)\,dr = M \tag{11.6.10}$$

(C) The metric coefficient $A(r)$ given by (11.1.11) must not be singular, so

$$\mathcal{M}(r) < \frac{r}{2G} \tag{11.6.11}$$

where

$$\mathcal{M}(r) \equiv \int_0^r 4\pi r'^2\rho(r')\,dr'$$

(D) The density $\rho(r)$ must not increase outward:

$$\rho'(r) \le 0 \tag{11.6.12}$$


(It is difficult to imagine that a fluid sphere with a larger density near the surface
than near the center could be stable.) Given any function $\rho(r)$ satisfying these
conditions, we can calculate $A(r)$ from Eq. (11.1.11); we can determine $p(r)$ by
integrating Eq. (11.1.13) inward from the surface (with the boundary condition
that $p(R) = 0$); and we can then calculate $B(r)$ from Eq. (11.1.16). Equation
(11.6.11) guarantees that $A(r)$ is well behaved, and as long as $p(r)$ is finite, Eq.
(11.1.13) will give $p(r) > 0$, and Eq. (11.1.16) will give a finite positive-definite
$B(r)$. Thus any absolute limitations on the input function $\rho(r)$ (such as an upper
bound on $MG/R$) can only come from the condition that Eq. (11.1.13) must yield
a finite solution for the pressure $p(r)$.

We shall exploit this condition rather indirectly, by concentrating on the
metric coefficient $B(r)$ rather than on $p(r)$ itself. We first derive an equation that
allows $B(r)$ to be calculated for a given density function $\rho(r)$, without having to
solve for $p(r)$; from (11.1.5) and (11.1.7), we have
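This equation can be linearized by defining

$$B \equiv \zeta^2 \tag{11.6.13}$$

Introducing Eq. (11.1.11) for $A(r)$, and rearranging a bit, we find

$$\frac{d}{dr}\left[\frac{1}{r}\left(1 - \frac{2G\mathcal{M}(r)}{r}\right)^{1/2}\frac{d\zeta(r)}{dr}\right] = G\left(1 - \frac{2G\mathcal{M}(r)}{r}\right)^{-1/2}\left(\frac{\mathcal{M}(r)}{r^3}\right)'\zeta(r) \tag{11.6.14}$$

The initial conditions at $r = R$ can be determined directly from Eq. (11.1.16),
or from the condition that $B(r)$ fit smoothly to the exterior solution (11.1.17);
either way, we find that

$$\zeta(R) = \left[1 - \frac{2MG}{R}\right]^{1/2} \tag{11.6.15}$$

$$\zeta'(R) = \frac{MG}{R^2}\left[1 - \frac{2MG}{R}\right]^{-1/2} \tag{11.6.16}$$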

The solution for $\zeta(r)$ must be positive, because $\zeta(r)$ can become negative only if it
passes through the value zero, at which point $B$ would vanish, and, according to
Eq. (11.1.16), $B$ can vanish only if the pressure $p(r)$ has a singularity.

We next proceed to derive an upper bound for $\zeta(0)$. If $\zeta$ is positive, then the
right-hand side of (11.6.14) is negative, because $3\mathcal{M}(r)/4\pi r^3$ is the mean density
within the radius $r$, and the mean density cannot increase with $r$ if the density
does not. Thus (11.6.14) gives

$$\frac{d}{dr}\left[\frac{1}{r}\left(1 - \frac{2G\mathcal{M}(r)}{r}\right)^{1/2}\frac{d\zeta(r)}{dr}\right] \le 0$$

the equality being attained only for uniform density. Integrating this inequality
from $r$ to $R$ and using (11.6.16), we have

$$\zeta'(r) \ge \frac{MGr}{R^3}\left[1 - \frac{2G\mathcal{M}(r)}{r}\right]^{-1/2}$$

Integrating again from 0 to $R$ and using (11.6.15) gives

$$\zeta(0) \le \left[1 - \frac{2MG}{R}\right]^{1/2} - \frac{MG}{R^3}\int_0^R \frac{r\,dr}{[1 - 2G\mathcal{M}(r)/r]^{1/2}}$$

The right-hand side is largest when $\mathcal{M}(r)$ is as small as possible. For a given mass
$M$ and radius $R$, the density distribution with $\rho'(r) \le 0$ that gives an $\mathcal{M}(r)$ that
is everywhere as small as possible has $\rho(r)$ constant, in which case

$$\mathcal{M}(r) = \frac{M r^3}{R^3}$$

Using this in the integral, our inequality is

$$\zeta(0) \le \frac{3}{2}\left[1 - \frac{2MG}{R}\right]^{1/2} - \frac{1}{2} \tag{11.6.17}$$

We have already noted that $\zeta(r)$ must be positive-definite; hence (11.6.17) implies
that

$$\frac{MG}{R} < \frac{4}{9} \tag{11.6.18}$$

This is just the upper limit found earlier for stars of uniform density, but now we
know that (11.6.18) holds for all stars, uniform or not.

It can also be proved that for a given mass and radius, the stable stars with 
smallest central pressure are those with uniform density. Hence the central pressure 
of any star is not less than the value obtained by setting $r = 0$ in Eq. (11.6.4),
that is,

$$p(0) \ge \frac{3M}{4\pi R^3}\left\{\frac{[1 - (2MG/R)]^{1/2} - 1}{1 - 3[1 - (2MG/R)]^{1/2}}\right\} \tag{11.6.19}$$

This again shows that $MG/R$ can never equal the forbidden value 4/9.

Our result can be immediately translated into a statement about the red shift
of spectral lines from the surface of any star. According to Eqs. (3.5.3), (11.1.1),
and (11.1.17), this is

$$z \equiv \frac{\Delta\lambda}{\lambda} = B^{-1/2}(R) - 1 = \left(1 - \frac{2MG}{R}\right)^{-1/2} - 1$$

Equation (11.6.18) imposes on $z$ the upper bound

$$z < 2 \tag{11.6.20}$$

In fact, there seems to be a large concentration of quasi-stellar radio sources (see
Chapter 14) whose spectral lines show red shifts close to 1.95! However, we
should not jump to the conclusion that these red shifts are necessarily due to
strong gravitational fields, for red shifts near $z = 2$ require the star to be composed
of a nearly incompressible fluid, with $d\rho/dp$ very small. This would seem unphysical,
since we do not want the speed of sound $(dp/d\rho)^{1/2}$ to become larger than
the speed of light!²⁶ Bondi²⁷ has shown that for a stable star with $(dp/d\rho) < 1$
and with $p/\rho < 1/3$ (as is the case for particles that interact only electromagnetically
and/or in localized collisions; see Section 2.10) the red shift of spectral lines
emitted from the surface is bounded by $z < 0.615$. In any case, there are quasi-stellar
objects with red shifts $z > 2$, such as 4C25.5, with $z = 2.358$.

However, there is no theorem that limits the red shifts of light signals from
the interior of static spherically symmetric bodies.²⁸ For instance, a light signal
from the center of a transparent uniform star would have a red shift given by Eqs.
(3.5.3), (11.1.1), and (11.6.6):

$$1 + z = B^{-1/2}(0) = \frac{2}{3[1 - (2MG/R)]^{1/2} - 1}$$

As $MG/R$ approaches the maximum value 4/9, this red shift becomes infinite.
Hoyle and Fowler²⁹ have suggested that a quasi-stellar object can consist of a
cluster of small dense stars, with the red shifts arising from emission and absorption
in a hot cloud of gas trapped near the cluster center. It is not yet clear whether
the red shifts of the QSO’s arise internally, or from some other cause, such as the
general cosmological recession of distant objects discussed in Chapter 14.
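
The contrast between the bounded surface red shift and the unbounded central red shift is easy to tabulate. This is a minimal sketch using only the two formulas above, with $c = MG/R$ as the single parameter; the function names are illustrative.

```python
import math

def surface_redshift(c):
    """z at the surface of any static star: (1 - 2MG/R)^(-1/2) - 1."""
    return (1.0 - 2.0 * c) ** -0.5 - 1.0

def central_redshift_uniform(c):
    """z for light from the center of a transparent uniform-density star."""
    return 2.0 / (3.0 * math.sqrt(1.0 - 2.0 * c) - 1.0) - 1.0

for c in (0.11, 0.3, 0.4, 0.44):
    print(f"MG/R = {c:.2f}: surface z = {surface_redshift(c):.3f}, "
          f"central z = {central_redshift_uniform(c):.2f}")

print("surface z at MG/R = 4/9:", surface_redshift(4.0 / 9.0))
```

The surface value approaches 2 as $MG/R$ approaches 4/9, while the central value grows without limit over the same range.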


7 Time-Dependent Spherically Symmetric Fields 

We now turn to the problems of stellar dynamics, and begin by writing down
the metric and Ricci tensor for a spherically symmetric but time-dependent
system. Spherical symmetry requires the proper time interval $d\tau^2$ to depend only
on the rotational invariants

$$t, \quad dt, \quad r, \quad \mathbf{x}\cdot d\mathbf{x} = r\,dr, \quad d\mathbf{x}^2 = dr^2 + r^2(d\theta^2 + \sin^2\theta\,d\varphi^2)$$

so it can be written

$$d\tau^2 = C(r,t)\,dt^2 - D(r,t)\,dr^2 - 2E(r,t)\,dr\,dt - F(r,t)\,r^2(d\theta^2 + \sin^2\theta\,d\varphi^2)$$

The function $F$ can be removed by defining a new radial variable

$$r' = rF^{1/2}(r,t)$$

The metric will then be of the same form, but with new functions $C'$, $D'$, $E'$ in
place of $C$, $D$, $E$, and of course with $r'$ in place of $r$ and no factor $F$. Dropping
primes, we have then

$$d\tau^2 = C(r,t)\,dt^2 - D(r,t)\,dr^2 - 2E(r,t)\,dr\,dt - r^2(d\theta^2 + \sin^2\theta\,d\varphi^2)$$

We next remove $E$, by defining a new time

$$dt' = \eta(r,t)\,[C(r,t)\,dt - E(r,t)\,dr]$$

where $\eta$ is an integrating factor defined to make the right-hand side a perfect
differential, that is, so that

$$\frac{\partial}{\partial r}[\eta(r,t)C(r,t)] = -\frac{\partial}{\partial t}[\eta(r,t)E(r,t)]$$

(This equation can be solved by treating it as an initial value problem; given
$\eta(r,t_0)$ for all $r$, we can solve for $\partial\eta(r,t)/\partial t$ at $t = t_0$ and thus determine $\eta(r,t_0 + dt)$
for all $r$.) The proper time is then

$$d\tau^2 = \eta^{-2}C^{-1}\,dt'^2 - (D + C^{-1}E^2)\,dr^2 - r^2(d\theta^2 + \sin^2\theta\,d\varphi^2)$$

or, introducing new functions $A$ and $B$ in place of $D + C^{-1}E^2$ and $\eta^{-2}C^{-1}$ and
dropping the prime on $t$,

$$d\tau^2 = B(r,t)\,dt^2 - A(r,t)\,dr^2 - r^2(d\theta^2 + \sin^2\theta\,d\varphi^2) \tag{11.7.1}$$

Thus we can use the metric in its familiar “standard” form, the only new feature
being that $A$ and $B$ now depend on $t$ as well as $r$.
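
The disappearance of the cross term when the new time is introduced can be verified in two lines. The following worked step is a sketch that uses only the definition of $dt'$ as $\eta$ times $C\,dt - E\,dr$, completing the square in the two terms of the line element that involve $dt$:

```latex
% Complete the square in the dt-dependent terms of the line element:
C\,dt^2 - 2E\,dr\,dt
  = C^{-1}\left[(C\,dt)^2 - 2E\,dr\,(C\,dt)\right]
  = C^{-1}\left[(C\,dt - E\,dr)^2 - E^2\,dr^2\right]
% Insert C dt - E dr = \eta^{-1} dt':
  = \eta^{-2}C^{-1}\,dt'^2 - C^{-1}E^2\,dr^2
% The remaining -D dr^2 then combines with -C^{-1}E^2 dr^2 to give -(D + C^{-1}E^2) dr^2.
```

No $dr\,dt'$ term survives, which is why the functions $D + C^{-1}E^2$ and $\eta^{-2}C^{-1}$ can be renamed $A$ and $B$.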

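The nonvanishing elements of the metric tensor and its inverse are

$$\begin{aligned}
g_{rr} &= A, & g_{\theta\theta} &= r^2, & g_{\varphi\varphi} &= r^2\sin^2\theta, & g_{tt} &= -B \\
g^{rr} &= A^{-1}, & g^{\theta\theta} &= r^{-2}, & g^{\varphi\varphi} &= r^{-2}(\sin\theta)^{-2}, & g^{tt} &= -B^{-1}
\end{aligned} \tag{11.7.2}$$

It follows that the nonvanishing elements of the affine connection are

$$\begin{aligned}
\Gamma^r_{rr} &= \frac{A'}{2A}, & \Gamma^r_{\theta\theta} &= -\frac{r}{A}, & \Gamma^r_{\varphi\varphi} &= -\frac{r\sin^2\theta}{A}, \\
\Gamma^r_{tt} &= \frac{B'}{2A}, & \Gamma^r_{rt} &= \Gamma^r_{tr} = \frac{\dot A}{2A}, \\
\Gamma^\theta_{\theta r} &= \Gamma^\theta_{r\theta} = \frac{1}{r}, & \Gamma^\theta_{\varphi\varphi} &= -\sin\theta\cos\theta, \\
\Gamma^\varphi_{\varphi r} &= \Gamma^\varphi_{r\varphi} = \frac{1}{r}, & \Gamma^\varphi_{\varphi\theta} &= \Gamma^\varphi_{\theta\varphi} = \cot\theta, \\
\Gamma^t_{rr} &= +\frac{\dot A}{2B}, & \Gamma^t_{tt} &= \frac{\dot B}{2B}, & \Gamma^t_{tr} &= \Gamma^t_{rt} = \frac{B'}{2B}
\end{aligned} \tag{11.7.3}$$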


(A prime or a dot now denotes $\partial/\partial r$ or $\partial/\partial t$, respectively.) From (6.1.5) we obtain
the independent nonzero components of the Ricci tensor:

$$R_{rr} = \frac{B''}{2B} - \frac{B'^2}{4B^2} - \frac{A'B'}{4AB} - \frac{A'}{Ar} - \frac{\ddot A}{2B} + \frac{\dot A\dot B}{4B^2} + \frac{\dot A^2}{4AB} \tag{11.7.4}$$

$$R_{\theta\theta} = -1 + \frac{1}{A} - \frac{rA'}{2A^2} + \frac{rB'}{2AB} \tag{11.7.5}$$

$$R_{tt} = -\frac{B''}{2A} + \frac{B'A'}{4A^2} - \frac{B'}{Ar} + \frac{B'^2}{4AB} + \frac{\ddot A}{2A} - \frac{\dot A^2}{4A^2} - \frac{\dot A\dot B}{4AB} \tag{11.7.6}$$

$$R_{tr} = -\frac{\dot A}{Ar} \tag{11.7.7}$$

Also, it follows from the spherical symmetry of the metric that

$$R_{\varphi\varphi} = \sin^2\theta\,R_{\theta\theta} \tag{11.7.8}$$

$$R_{r\theta} = R_{r\varphi} = R_{\theta\varphi} = R_{t\theta} = R_{t\varphi} = 0 \tag{11.7.9}$$

As a simple but important application of these results, let us consider a
spherically symmetric but not necessarily static field in empty space, where the
field equations read $R_{\mu\nu} = 0$. According to (11.7.7), the field equation $R_{tr} = 0$
just tells us that $A$ is time-independent:

$$\dot A = 0$$

Inspection of (11.7.4)–(11.7.6) then shows that all time derivatives drop out of
the field equations, and they become identical with the equations for a static
isotropic gravitational field in empty space. [See Eq. (8.1.13).] We can then repeat
the arguments of Section 8.2; the vanishing of $R_{rr}$ and $R_{tt}$ gives

$$(AB)' = 0$$

and the vanishing of $R_{\theta\theta}$ gives

$$\left(\frac{r}{A}\right)' = 1$$

Since $A$ is time-independent, the general solution is

$$A = \left[1 - \frac{2MG}{r}\right]^{-1}, \qquad B = f(t)\left[1 - \frac{2MG}{r}\right]$$

with $GM$ a time-independent integration constant, and $f(t)$ an unknown function
of $t$. The function $f(t)$ can be made to equal unity by defining a new time coordinate:

$$t' = \int f^{1/2}(t)\,dt$$

The metric is now entirely time-independent, and agrees with the Schwarzschild
solution (8.2.12). We have thus proved the Birkhoff theorem,³⁰ that a spherically
symmetric gravitational field in empty space must be static, with a metric given
by the Schwarzschild solution.
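
The two combinations of field equations used in this proof can be written out explicitly. The following worked lines are a sketch that assumes the standard static-limit forms of $R_{rr}$, $R_{\theta\theta}$, and $R_{tt}$ (those of Eq. (8.1.13)), to which (11.7.4) through (11.7.6) reduce once $\dot A = 0$:

```latex
% With \dot{A} = 0 every time-derivative term in R_rr and R_tt vanishes, and
\frac{R_{rr}}{A} + \frac{R_{tt}}{B}
  = -\frac{1}{rA}\left(\frac{A'}{A} + \frac{B'}{B}\right)
  = -\frac{(AB)'}{rA^2B} = 0
  \quad\Longrightarrow\quad (AB)' = 0
% Using B'/B = -A'/A in R_{\theta\theta}:
R_{\theta\theta} = -1 + \frac{1}{A} - \frac{rA'}{A^2}
  = -1 + \left(\frac{r}{A}\right)' = 0
  \quad\Longrightarrow\quad \frac{r}{A} = r - 2MG
```

Here $-2MG$ is simply the name given to the integration constant, and $(AB)' = 0$ with $A$ independent of $t$ leaves $B = f(t)/A$, which is the general solution quoted above.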

The Birkhoff theorem is analogous to the result proved by Newton in his 
theory of the lunar motion, that the gravitational field outside a spherically 
symmetric body behaves as if the whole mass of the body were concentrated at the 
center. It is a little surprising that this result should apply in general relativity 
as well as in Newton’s theory, for in general relativity a nonstatic body will 
usually radiate gravitational waves. The Birkhoff theorem tells us that, although 
a pulsating spherically symmetric body can of course produce nonstatic gravita- 
tional fields within its mass, no gravitational radiation can escape into empty 
space. In this sense, the Birkhoff theorem is analogous to the well-known result
of atomic theory, that a photon cannot be emitted in a quantum transition between 
two states of zero spin. 

The Birkhoff theorem may be applied, not only to the gravitational field 
outside a body, but also to the field inside an empty spherical cavity at the center 
of a spherically symmetric (but not necessarily static) body. In this case the metric 
is again given by the Schwarzschild solution, but since the point r = 0 is here in 
empty space, there can be no singularity, so the integration constant MG must 
vanish. The Birkhoff theorem thus has the corollary that the metric inside an empty 
spherical cavity at the center of a spherically symmetric system must be equivalent to 
the flat-space Minkowski metric $\eta_{\mu\nu}$. This corollary is analogous to another famous
result of Newtonian theory, that the gravitational field of a spherical shell vanishes 
inside the shell. Stars do not usually have holes at their centers, so this corollary 
will not be of much use to us in this chapter. Its importance arises from the fact 
that the Birkhoff theorem is a local theorem, not depending on any conditions on 
the metric for $r \to \infty$ (aside from spherical symmetry), so that space must be flat
in a spherical cavity at the center of a spherically symmetric system, even if the 
system is infinite — even, in fact, if the system is the whole universe. We shall see 
in Section 15.1 that the corollary to Birkhoff's theorem can be used to justify a
limited use of Newtonian mechanics in cosmological problems. 


8 Comoving Coordinates 

As a further preparation for our treatment of gravitational collapse, and also 
to lay a groundwork for our discussion of cosmology in Chapter 14, we now construct
a very useful set of coordinates, the comoving coordinate system, 31 which
incorporates a more natural separation between space and time than that provided 
by the standard coordinates used in the last section. 

Imagine a finite region of space filled with a dense cloud of freely falling 
particles. Each particle is assumed to carry along a little clock, and is given a 
fixed set of spatial coordinates, which can be defined as the coordinates $x^i$ of the
particle, in some arbitrary system, when its own clock reads $t = 0$. (The rules for
setting these different clocks are discussed below.) The space-time coordinates $\mathbf{x}, t$
of any event are defined by taking $\mathbf{x}$ as the spatial coordinate label of the particle
that is just going by when and where the event occurs, and by taking t as the 
time then shown on that particle’s clock. We may think of the coordinate mesh as 
being dragged along by the cloud of particles, with time defined by clocks stuck 
on the mesh. This coordinate system will be useful throughout the region occupied 
by the particle cloud, for whatever interval of time in which particle trajectories 
do not cross. 

The metric in comoving coordinates is characterized by certain specially 
simple features. First, we note that the clocks are in free fall and therefore tell
proper time, so the proper time interval between two points $\mathbf{x}, t$ and $\mathbf{x}, t + dt$ on
a given particle's trajectory is just $dt$, that is,

$$ d\tau^2 = -g_{\mu\nu}\, dx^\mu\, dx^\nu = -g_{tt}\, dt^2 $$

and therefore

$$ g_{tt} = -1 \tag{11.8.1} $$


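Also, we note that the particle trajectory $\mathbf{x}$ = constant, $t = \tau$ satisfies the
equation of free fall, so

$$ 0 = \frac{d^2 x^i}{d\tau^2} + \Gamma^i_{\mu\nu} \frac{dx^\mu}{d\tau} \frac{dx^\nu}{d\tau} = \Gamma^i_{tt} $$

Using (11.8.1), this gives

$$ 0 = g^{ij} \frac{\partial g_{tj}}{\partial t} $$

or, since $g^{ij}$ is generally a nonsingular matrix,

$$ \frac{\partial g_{ti}}{\partial t} = 0 \tag{11.8.2} $$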

We have kept open the option of setting the clocks attached to the different 
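particles in an arbitrary fashion. Suppose that we reset these clocks by a transformation

$$ t' = t + f(\mathbf{x}) \qquad \mathbf{x}' = \mathbf{x} \tag{11.8.3} $$

The new metric will have the elements

$$ g'_{tt} = -1 \tag{11.8.4} $$

$$ g'_{ti} = g_{ti} + \frac{\partial f}{\partial x^i} \tag{11.8.5} $$

$$ g'_{ij} = g_{ij} - g_{ti} \frac{\partial f}{\partial x^j} - g_{tj} \frac{\partial f}{\partial x^i} - \frac{\partial f}{\partial x^i} \frac{\partial f}{\partial x^j} \tag{11.8.6} $$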


It would be a great simplification if the function $f$ could be chosen so that the two
terms in Eq. (11.8.5) cancel, giving $g'_{ti} = 0$. There are two important cases where
this is possible: 


(A) Suppose that we can reset all clocks so that all particles are at rest at a
time $t = 0$. This assumption can be given an absolute physical significance by
interpreting it to mean that for each particle $P$ at $t = 0$, it is possible to find a
locally inertial coordinate system $\xi^\mu$ in which the separation between $P$ and
neighboring particles is purely spatial,

$$ \left( \frac{\partial \xi^0}{\partial x^i} \right)_{t=0,\ \mathbf{x}=\mathbf{x}_P} = 0 $$

and in which the movement of $P$ in a time interval $dt$ is purely temporal,

$$ \left( \frac{\partial \xi^i}{\partial t} \right)_{t=0,\ \mathbf{x}=\mathbf{x}_P} = 0 $$

The metric in this locally inertial system is the Minkowski metric $\eta_{\mu\nu}$, so the space-time
components of the metric in the comoving system at $t = 0$ are

$$ g_{ti}(\mathbf{x}_P, 0) = \eta_{\mu\nu} \left( \frac{\partial \xi^\mu}{\partial t} \frac{\partial \xi^\nu}{\partial x^i} \right)_{t=0,\ \mathbf{x}=\mathbf{x}_P} = 0 $$

With (11.8.2), it follows that $g_{ti}$ vanishes everywhere, so the metric is given by

$$ d\tau^2 = dt^2 - g_{ij}(\mathbf{x}, t)\, dx^i\, dx^j \tag{11.8.7} $$

(B) If the metric is manifestly spherically symmetric, then the line element
must have the general form with which we started in the last section, that is,

$$ d\tau^2 = C(r, t)\, dt^2 - D(r, t)\, dr^2 - 2E(r, t)\, dr\, dt - F(r, t)\, r^2 (d\theta^2 + \sin^2\theta\, d\varphi^2) $$

The only nonvanishing time-space component $g_{ti}$ is $g_{tr} = 2E$, and (11.8.2) then
tells us that $E$ is time-independent, so

$$ g_{tr} = 2E(r) \qquad g_{t\theta} = g_{t\varphi} = 0 $$

We can therefore eliminate the components $g_{ti}$ by resetting the clocks as in
(11.8.3), with

$$ f = -2 \int E(r)\, dr $$

Using (11.8.4) and dropping primes, the metric is now of the form

$$ d\tau^2 = dt^2 - U(r, t)\, dr^2 - V(r, t)(d\theta^2 + \sin^2\theta\, d\varphi^2) \tag{11.8.8} $$

with $U$ and $V$ new unknown functions that replace $D$ and $F$.

It is of course possible to construct coordinate systems of this sort even if the 
cloud of freely falling particles is purely imaginary. In differential geometry, 
coordinate systems satisfying (11.8.1) and (11.8.2) are called Gaussian, and if $g_{ti}$
vanishes, so that the line element takes the form (11.8.7), then we call the coordinates
Gaussian normal. However, these coordinate systems find their most
important applications to systems that actually do consist of a freely falling fluid.
In this case the fluid velocity four-vector by definition has zero space component,

$$ U^i = 0 \tag{11.8.9} $$

and since $U^\mu$ is normalized so that

$$ g_{\mu\nu} U^\mu U^\nu = -1 \tag{11.8.10} $$

[see Eq. (5.4.4)] the time component of $U^\mu$ must be

$$ U^t = (-g_{tt})^{-1/2} = 1 \tag{11.8.11} $$

We shall be working only with spherically symmetric comoving coordinate 
systems, with line element (11.8.8). The nonvanishing elements of the metric 
tensor are 

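$$ g_{rr} = U \qquad g_{\theta\theta} = V \qquad g_{\varphi\varphi} = V \sin^2\theta \qquad g_{tt} = -1 $$

$$ g^{rr} = U^{-1} \qquad g^{\theta\theta} = V^{-1} \qquad g^{\varphi\varphi} = (V \sin^2\theta)^{-1} \qquad g^{tt} = -1 \tag{11.8.12} $$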

The nonvanishing elements of the affine connection are readily calculated as 


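$$ \Gamma^r_{rr} = \frac{U'}{2U} \qquad \Gamma^r_{\theta\theta} = -\frac{V'}{2U} \qquad \Gamma^r_{\varphi\varphi} = -\frac{V' \sin^2\theta}{2U} \qquad \Gamma^r_{rt} = \frac{\dot U}{2U} $$

$$ \Gamma^\theta_{r\theta} = \frac{V'}{2V} \qquad \Gamma^\theta_{\varphi\varphi} = -\sin\theta \cos\theta \qquad \Gamma^\theta_{t\theta} = \frac{\dot V}{2V} $$

$$ \Gamma^\varphi_{r\varphi} = \frac{V'}{2V} \qquad \Gamma^\varphi_{\theta\varphi} = \cot\theta \qquad \Gamma^\varphi_{t\varphi} = \frac{\dot V}{2V} $$

$$ \Gamma^t_{rr} = \frac{\dot U}{2} \qquad \Gamma^t_{\theta\theta} = \frac{\dot V}{2} \qquad \Gamma^t_{\varphi\varphi} = \frac{\dot V}{2} \sin^2\theta \tag{11.8.13} $$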


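(A prime or dot denotes $\partial/\partial r$ or $\partial/\partial t$, respectively.) From (6.1.5) we obtain the
independent nonzero components of the Ricci tensor:

$$ R_{rr} = \frac{V''}{V} - \frac{V'^2}{2V^2} - \frac{U' V'}{2UV} - \frac{\ddot U}{2} + \frac{\dot U^2}{4U} - \frac{\dot U \dot V}{2V} \tag{11.8.14} $$

$$ R_{\theta\theta} = -1 + \frac{V''}{2U} - \frac{V' U'}{4U^2} - \frac{\ddot V}{2} - \frac{\dot V \dot U}{4U} \tag{11.8.15} $$

$$ R_{tt} = \frac{\ddot U}{2U} + \frac{\ddot V}{V} - \frac{\dot U^2}{4U^2} - \frac{\dot V^2}{2V^2} \tag{11.8.16} $$

$$ R_{tr} = \frac{\dot V'}{V} - \frac{V' \dot V}{2V^2} - \frac{\dot U V'}{2UV} \tag{11.8.17} $$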


Also, it again follows from the spherical symmetry of the metric that

$$ R_{\varphi\varphi} = R_{\theta\theta} \sin^2\theta \tag{11.8.18} $$

$$ R_{r\theta} = R_{r\varphi} = R_{\theta\varphi} = R_{\theta t} = R_{\varphi t} = 0 \tag{11.8.19} $$


9 Gravitational Collapse


We saw in Sections 11.3 and 11.4 that a cooling star of mass greater than a 
few solar masses cannot reach equilibrium as either a white dwarf or a neutron 
star. It may be that a massive star will always eject enough matter by the time it 
reaches the end of its thermonuclear evolution so that its mass drops below the 
Chandrasekhar or the Oppenheimer-Volkoff limits. If not, then it will collapse. 

A proper treatment of gravitational collapse would be prohibitively complicated
for this book. In order to get some feeling for what can happen during
collapse, we consider only the simplest case, 32 the spherically symmetric collapse
of “dust” with negligible pressure. Since the dust particles are acted on by purely 
gravitational forces, they fall freely, and we can use them as the physical basis 
of a comoving coordinate system of the sort discussed in the last section. The 
metric is then given by Eq. (11.8.8): 

$$ d\tau^2 = dt^2 - U(r, t)\, dr^2 - V(r, t)(d\theta^2 + \sin^2\theta\, d\varphi^2) \tag{11.9.1} $$

The energy-momentum tensor for a fluid of negligible pressure is given by Eq. 
(5.4.2) as 

$$ T^{\mu\nu} = \rho\, U^\mu U^\nu \tag{11.9.2} $$

where $\rho(r, t)$ is the proper energy density and $U^\mu$ is the velocity four-vector, given
for a comoving coordinate system by Eqs. (11.8.9) and (11.8.11):

$$ U^r = U^\theta = U^\varphi = 0, \qquad U^t = 1 \tag{11.9.3} $$


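The equations of momentum conservation $(T^\mu{}_i)_{;\mu} = 0$ are automatically satisfied,
and the equation for energy conservation reads

$$ 0 = (T^\mu{}_t)_{;\mu} = -\frac{\partial \rho}{\partial t} - \rho\, \Gamma^\lambda_{t\lambda} = -\frac{\partial \rho}{\partial t} - \rho \left( \frac{\dot U}{2U} + \frac{\dot V}{V} \right) $$

or in other words

$$ \frac{\partial}{\partial t} \left( \rho V \sqrt{U} \right) = 0 \tag{11.9.4} $$

The Einstein field equations can be written

$$ R_{\mu\nu} = -8\pi G\, S_{\mu\nu} \tag{11.9.5} $$

where

$$ S_{\mu\nu} \equiv T_{\mu\nu} - \tfrac{1}{2} g_{\mu\nu} T^\lambda{}_\lambda = \rho \left[ \tfrac{1}{2} g_{\mu\nu} + U_\mu U_\nu \right] \tag{11.9.6} $$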


This may be evaluated with the aid of Eqs. (11.9.1) and (11.9.3); we find that the
only nonvanishing components of $S_{\mu\nu}$ are

$$ S_{rr} = \tfrac{1}{2} \rho U \qquad S_{\theta\theta} = \tfrac{1}{2} \rho V \qquad S_{\varphi\varphi} = S_{\theta\theta} \sin^2\theta \qquad S_{tt} = \tfrac{1}{2} \rho \tag{11.9.7} $$





In particular, 

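$$ S_{tr} = 0 \tag{11.9.8} $$

Using (11.9.7)-(11.9.8) and (11.8.14)-(11.8.17) in (11.9.5) yields four field equations:

$$ \frac{1}{U} \left[ \frac{V''}{V} - \frac{V'^2}{2V^2} - \frac{U' V'}{2UV} \right] - \frac{\ddot U}{2U} + \frac{\dot U^2}{4U^2} - \frac{\dot U \dot V}{2UV} = -4\pi G \rho \tag{11.9.9} $$

$$ \frac{1}{V} \left[ -1 + \frac{V''}{2U} - \frac{V' U'}{4U^2} \right] - \frac{\ddot V}{2V} - \frac{\dot V \dot U}{4UV} = -4\pi G \rho \tag{11.9.10} $$

$$ \frac{\ddot U}{2U} + \frac{\ddot V}{V} - \frac{\dot U^2}{4U^2} - \frac{\dot V^2}{2V^2} = -4\pi G \rho \tag{11.9.11} $$

$$ \frac{\dot V'}{V} - \frac{V' \dot V}{2V^2} - \frac{\dot U V'}{2UV} = 0 \tag{11.9.12} $$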


Let us simplify our model even further, and assume that $\rho$ is independent of
position. 32 We can now seek a separable solution, with

$$ U = R^2(t) f(r) \qquad V = S^2(t) g(r) $$

Then (11.9.12) requires that $\dot S/S$ equal $\dot R/R$, so we can normalize $f$ and $g$ so that

$$ S(t) = R(t) $$

Also, we are still free to redefine the radial coordinate as an arbitrary function
$\tilde r$ of $r$, and in particular we can choose $\tilde r = \sqrt{g(r)}$, so $f$ and $g$ are replaced with
$\tilde f = 4 f g / g'^2$ and $\tilde g = \tilde r^2$. Dropping the tildes, we have then

$$ U = R^2(t) f(r) \qquad V = R^2(t) r^2 \tag{11.9.13} $$

Equations (11.9.9) and (11.9.10) then read 

$$ -\frac{f'(r)}{r f^2(r)} - \ddot R(t) R(t) - 2 \dot R^2(t) = -4\pi G R^2(t) \rho(t) \tag{11.9.14} $$

$$ \left[ -\frac{1}{r^2} + \frac{1}{r^2 f(r)} - \frac{f'(r)}{2 r f^2(r)} \right] - \ddot R(t) R(t) - 2 \dot R^2(t) = -4\pi G R^2(t) \rho(t) \tag{11.9.15} $$

The first terms in (11.9.14) and (11.9.15) must evidently be equal constants, which
we shall call $-2k$:

$$ -\frac{f'(r)}{r f^2(r)} = -2k \qquad -\frac{1}{r^2} + \frac{1}{r^2 f(r)} - \frac{f'(r)}{2 r f^2(r)} = -2k $$

The unique solution is

$$ f(r) = [1 - k r^2]^{-1} $$

so the metric takes the form

$$ d\tau^2 = dt^2 - R^2(t) \left[ \frac{dr^2}{1 - k r^2} + r^2\, d\theta^2 + r^2 \sin^2\theta\, d\varphi^2 \right] \tag{11.9.16} $$

(Incidentally, this metric is spatially homogeneous as well as isotropic, and for 
this reason it will provide the kinematic framework for our treatment of relativistic 
cosmology in Chapter 14.) 
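
As a quick numerical sanity check of the separation step above, the following sketch evaluates the r-dependent terms of the two radial field equations, assumed here to be -f'/(r f^2) and -1/r^2 + 1/(r^2 f) - f'/(2 r f^2), for f(r) = (1 - k r^2)^(-1), and confirms that both equal -2k. The function names, the finite-difference step, and the sample values of k and r are illustrative choices, not part of the text.

```python
def f(r, k):
    """Radial metric function f(r) = 1 / (1 - k r^2)."""
    return 1.0 / (1.0 - k * r * r)


def f_prime(r, k, h=1e-6):
    """Central finite-difference estimate of df/dr (h is an arbitrary small step)."""
    return (f(r + h, k) - f(r - h, k)) / (2.0 * h)


def separation_terms(r, k):
    """Return the r-dependent terms of the rr and theta-theta field equations."""
    fr = f(r, k)
    fp = f_prime(r, k)
    rr_term = -fp / (r * fr**2)
    thth_term = -1.0 / r**2 + 1.0 / (r**2 * fr) - fp / (2.0 * r * fr**2)
    return rr_term, thth_term


if __name__ == "__main__":
    k = 0.3  # illustrative curvature constant
    for r in (0.2, 0.7, 1.1):
        # Both terms should agree with -2k at every radius.
        print(r, separation_terms(r, k), -2.0 * k)
```

Both terms come out independent of r, which is what allows the time-dependent and radial parts of the field equations to be separated.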

Our remaining problem is to calculate the functions $\rho(t)$ and $R(t)$. Using
(11.9.13) and (11.9.14) in the energy-conservation equation (11.9.4), we find that
$\rho(t) R^3(t)$ is constant. We normalize the radial coordinate $r$ so that

$$ R(0) = 1 \tag{11.9.17} $$

and therefore

$$ \rho(t) = \rho(0) R^{-3}(t) \tag{11.9.18} $$

The field equations (11.9.14) or (11.9.15) and (11.9.11) are now ordinary differential
equations:

$$ -2k - \ddot R(t) R(t) - 2 \dot R^2(t) = -4\pi G \rho(0) R^{-1}(t) \tag{11.9.19} $$

$$ \ddot R(t) R(t) = -\frac{4\pi G}{3} \rho(0) R^{-1}(t) \tag{11.9.20} $$

We can eliminate $\ddot R(t)$ by adding these two equations, and find

$$ \dot R^2(t) = -k + \frac{8\pi G}{3} \rho(0) R^{-1}(t) \tag{11.9.21} $$

Equations (11.9.19) and (11.9.20) can be recovered from (11.9.21) and its time
derivative, so we can forget about them and simply use (11.9.21) to calculate $R(t)$.

We shall now assume that the fluid is at rest (in standard coordinates) at
$t = 0$, so

$$ \dot R(0) = 0 \tag{11.9.22} $$

and therefore (11.9.21) and (11.9.17) give

$$ k = \frac{8\pi G}{3} \rho(0) \tag{11.9.23} $$

Thus Eq. (11.9.21) can be written

$$ \dot R^2(t) = k [R^{-1}(t) - 1] \tag{11.9.24} $$





The solution is given by the parametric equations of a cycloid:

$$ t = \frac{\psi + \sin\psi}{2\sqrt{k}} \qquad R = \tfrac{1}{2}(1 + \cos\psi) \tag{11.9.25} $$

Note that $R(t)$ vanishes when $\psi = \pi$, and hence when $t = T$, where

$$ T = \frac{\pi}{2\sqrt{k}} = \frac{\pi}{2} \left( \frac{3}{8\pi G \rho(0)} \right)^{1/2} \tag{11.9.26} $$

Thus a fluid sphere of initial density $\rho(0)$ and zero pressure will collapse from rest to
a state of infinite proper energy density in the finite time $T$.
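
The cycloid solution can be checked numerically. The sketch below, a minimal illustration rather than part of the original derivation, verifies that the parametric form satisfies the first-order equation for the scale factor, (dR/dt)^2 = k(1/R - 1), and evaluates the collapse time for a given initial density. The function names and the SI value of Newton's constant are assumptions; the density is taken as a mass density so that factors of c drop out.

```python
import math

G_SI = 6.674e-11  # Newton's constant in m^3 kg^-1 s^-2 (assumed value)


def cycloid(psi, k):
    """Return (t, R) on the cycloid for parameter psi in [0, pi]."""
    t = (psi + math.sin(psi)) / (2.0 * math.sqrt(k))
    big_r = 0.5 * (1.0 + math.cos(psi))
    return t, big_r


def r_dot(psi, k):
    """dR/dt along the cycloid, computed as (dR/dpsi) / (dt/dpsi)."""
    return -math.sqrt(k) * math.sin(psi) / (1.0 + math.cos(psi))


def collapse_time(rho0_mass_density):
    """Collapse time T = (pi/2) sqrt(3 / (8 pi G rho0)) in seconds.

    rho0 is a mass density in kg/m^3 (an assumption of this sketch).
    """
    return 0.5 * math.pi * math.sqrt(3.0 / (8.0 * math.pi * G_SI * rho0_mass_density))


if __name__ == "__main__":
    k = 2.0  # illustrative value
    for psi in (0.3, 1.5, 3.0):
        t, big_r = cycloid(psi, k)
        # Left and right sides of (dR/dt)^2 = k (1/R - 1) should agree.
        print(psi, r_dot(psi, k) ** 2, k * (1.0 / big_r - 1.0))
    print(collapse_time(1.0))
```

Because T depends only on the initial density, spheres of very different size but equal density collapse in the same comoving time.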

Although the collapse is complete at a finite coordinate time $t = T$, any
light signal coming to us from the sphere's surface will be delayed by its gravitational
field (see Section 8.7), so we on earth will not see the star suddenly vanish.
To make this more specific, we have to complete our calculation by finding the 
metric outside the star. 

The Birkhoff theorem proved in Section 11.7 shows that it is always possible
to find a “standard” coordinate system $\bar r, \bar\theta, \bar\varphi, \bar t$ in which the metric outside the
sphere takes the form

$$ d\tau^2 = \left( 1 - \frac{2MG}{\bar r} \right) d\bar t^2 - \left( 1 - \frac{2MG}{\bar r} \right)^{-1} d\bar r^2 - \bar r^2\, d\bar\theta^2 - \bar r^2 \sin^2\bar\theta\, d\bar\varphi^2 \tag{11.9.27} $$

But this metric is not in the Gaussian normal form (11.9.1), so in order to match
solutions at the surface we either have to convert the interior solution (11.9.16)
into standard coordinates, or the exterior solution (11.9.27) into Gaussian normal
coordinates. We choose the former course. 32

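Inspection of Eq. (11.9.16) shows immediately that the standard spatial
coordinates $\bar r, \bar\theta, \bar\varphi$ must be chosen as

$$ \bar r = r R(t), \qquad \bar\theta = \theta, \qquad \bar\varphi = \varphi \tag{11.9.28} $$

In order to define a standard time coordinate such that $d\tau^2$ does not contain a
cross-term $d\bar r\, d\bar t$, we employ the “integrating factor” technique described in
Section 11.7, which gives

$$ \bar t = \left( \frac{1 - k a^2}{k} \right)^{1/2} \int_{S(r,t)}^{1} \frac{dR}{1 - k a^2/R} \left( \frac{R}{1 - R} \right)^{1/2} \tag{11.9.29} $$

where

$$ S(r, t) = 1 - \left( \frac{1 - k r^2}{1 - k a^2} \right)^{1/2} (1 - R(t)) \tag{11.9.30} $$

The constant $a$ is arbitrary, but may conveniently be chosen as the radius of the
sphere in comoving coordinates. It is straightforward to check that the metric in
the coordinate system $\bar r, \bar\theta, \bar\varphi, \bar t$ takes the standard form

$$ d\tau^2 = B(\bar r, \bar t)\, d\bar t^2 - A(\bar r, \bar t)\, d\bar r^2 - \bar r^2 (d\bar\theta^2 + \sin^2\bar\theta\, d\bar\varphi^2) $$

with

$$ B = \frac{R}{S} \left( \frac{1 - k r^2}{1 - k a^2} \right)^{1/2} \frac{(1 - k a^2/S)^2}{(1 - k r^2/R)} \tag{11.9.31} $$

$$ A = \left( 1 - \frac{k r^2}{R} \right)^{-1} \tag{11.9.32} $$

it now being understood that $S$ is a function of $\bar t$ defined by Eq. (11.9.29) and that
$r$ and $R(t)$ are functions of $\bar r$ and $S$, or $\bar r$ and $\bar t$, defined by solving Eqs. (11.9.28)
and (11.9.30). This is a mess, but at the radius $r = a$ of the star (a constant, since
$r$ is a comoving coordinate) we have

$$ \bar r = a R(t) \tag{11.9.33} $$

$$ \bar t = \left( \frac{1 - k a^2}{k} \right)^{1/2} \int_{R(t)}^{1} \frac{dR}{1 - k a^2/R} \left( \frac{R}{1 - R} \right)^{1/2} \tag{11.9.34} $$

$$ B = 1 - \frac{k a^2}{R(t)} \tag{11.9.35} $$

$$ A = \left( 1 - \frac{k a^2}{R(t)} \right)^{-1} \tag{11.9.36} $$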

(Eq. (11.9.34) could have been obtained by integrating the equations for free fall
given in Section 8.4.) Comparing with (11.9.27), we see that the interior and
exterior solutions fit continuously at $\bar r = a R(t)$ if

$$ k = \frac{2MG}{a^3} \tag{11.9.37} $$

With (11.9.23), this just says that

$$ M = \frac{4\pi}{3} \rho(0) a^3 \tag{11.9.38} $$

not a surprising result!

Now we return to the problem of calculating the behavior of light signals
emitted from the surface of the collapsing sphere. A light signal emitted in a
radial direction at a standard time $\bar t$ will have $d\bar r/d\bar t$ given by Eq. (11.9.27) and the
condition $d\tau = 0$, so it will arrive at a distant point $\bar r'$ at a time

$$ \bar t' = \bar t + \int_{a R(t)}^{\bar r'} \left( 1 - \frac{2MG}{\bar r} \right)^{-1} d\bar r \tag{11.9.39} $$

The most striking consequence of Eqs. (11.9.39) and (11.9.34) is that both $\bar t$ and
$\bar t'$ approach infinity when the radius (11.9.33) of the sphere approaches the
Schwarzschild radius $2GM$, that is, when

$$ R(t) \to \frac{2GM}{a} = k a^2 \tag{11.9.40} $$

The collapse to the Schwarzschild radius therefore appears to an outside observer to
take an infinite time, and the collapse to $R = 0$ is utterly unobservable from outside.
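
The divergence of the standard time at the surface can be illustrated numerically. The sketch below is a minimal illustration under stated assumptions: it takes the surface standard time to be sqrt((1 - k a^2)/k) times the integral from R(t) to 1 of dR sqrt(R/(1 - R)) / (1 - k a^2/R), evaluates it with a simple midpoint rule after the substitution R = 1 - u^2, and compares it with the comoving time from the cycloid. The function names, step count, and sample values k = 0.1, a = 1 are hypothetical choices.

```python
import math


def standard_time_at_surface(big_r, k, a, steps=200_000):
    """Midpoint-rule estimate of the surface standard time for scale factor big_r.

    The substitution R = 1 - u^2 removes the integrable square-root
    singularity at R = 1. Requires k a^2 < big_r <= 1.
    """
    ka2 = k * a * a
    u_max = math.sqrt(1.0 - big_r)
    h = u_max / steps
    total = 0.0
    for i in range(steps):
        u = (i + 0.5) * h
        r_val = 1.0 - u * u
        # Integrand after substitution: 2 sqrt(R) / (1 - k a^2 / R)
        total += 2.0 * math.sqrt(r_val) / (1.0 - ka2 / r_val)
    return math.sqrt((1.0 - ka2) / k) * total * h


def comoving_time(big_r, k):
    """Comoving time at which the scale factor equals big_r (cycloid solution)."""
    psi = math.acos(2.0 * big_r - 1.0)
    return (psi + math.sin(psi)) / (2.0 * math.sqrt(k))


if __name__ == "__main__":
    k, a = 0.1, 1.0  # illustrative values with k a^2 = 0.1
    for delta in (1e-1, 1e-2, 1e-3):
        big_r = k * a * a * (1.0 + delta)
        print(delta, comoving_time(big_r, k), standard_time_at_surface(big_r, k, a))
```

Each tenfold decrease of R - k a^2 adds roughly k a^3 ln 10 to the standard time, while the comoving time barely changes, which is the logarithmic divergence described above.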

Although the collapsing sphere does not suddenly disappear, it does fade out 
of sight, because light from its surface is subject to an increasing red shift. The 
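proper time for a light source on the sphere's surface is just the comoving time $t$,
so the comoving time interval $dt$ between emission of wave crests at the surface
equals the natural wavelength $\lambda_0$ that would be emitted by the source in the
absence of gravitation. The standard time interval $d\bar t'$ between arrivals of wave
crests at $\bar r'$ is the observed wavelength $\lambda'$; thus the fractional change of wavelength
is

$$ z \equiv \frac{\lambda' - \lambda_0}{\lambda_0} = \frac{d\bar t'}{dt} - 1 = -\dot R(t) \left( 1 - \frac{k a^2}{R(t)} \right)^{-1} \left[ \left( \frac{1 - k a^2}{k} \right)^{1/2} \left( \frac{R(t)}{1 - R(t)} \right)^{1/2} + a \right] - 1 $$

Using (11.9.24) to determine $\dot R(t)$, this is

$$ z = \left( 1 - \frac{k a^2}{R(t)} \right)^{-1} \left[ (1 - k a^2)^{1/2} + a \sqrt{k} \left( \frac{1 - R(t)}{R(t)} \right)^{1/2} \right] - 1 \tag{11.9.41} $$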

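In order to see how the red shift $z$ varies with $\bar t'$, let us assume that the sphere is
initially very much larger than its Schwarzschild radius

$$ k a^2 = \frac{2GM}{a} \ll 1 \tag{11.9.42} $$

and distinguish two periods in the history of the collapse:

(A) Until $t$ gets close to $T$, we have

$$ \frac{k a^2}{R(t)} \ll 1 \tag{11.9.43} $$

Using (11.9.42) and (11.9.43) in (11.9.34), (11.9.39), and (11.9.41) gives (with
$\bar r' \gg a$)

$$ \bar t \simeq t $$

$$ \bar t' \simeq \bar t + \bar r' - a R(t) \simeq t + \bar r' - a R(t) \simeq t + \bar r' $$

$$ z \simeq a \sqrt{k} \left( \frac{1 - R(t)}{R(t)} \right)^{1/2} \tag{11.9.44} $$

(B) Eventually we have

$$ \frac{k a^2}{R(t)} \simeq 1 $$

at a time $t$ given by (11.9.25) as

$$ t \simeq \frac{1}{2\sqrt{k}} \left[ \pi - \tfrac{4}{3} (k a^2)^{3/2} \right] $$

Now (11.9.34), (11.9.39), and (11.9.41) give

$$ \bar t \simeq -k a^3 \ln \left[ 1 - \frac{k a^2}{R(t)} \right] + \text{constant} $$

$$ \bar t' \simeq \bar t - k a^3 \ln \left[ 1 - \frac{k a^2}{R(t)} \right] + \text{constant} \simeq -2 k a^3 \ln \left[ 1 - \frac{k a^2}{R(t)} \right] + \text{constant} \tag{11.9.45} $$

$$ z \simeq 2 \left[ 1 - \frac{k a^2}{R(t)} \right]^{-1} \propto \exp \left( \frac{\bar t'}{2 k a^3} \right) \tag{11.9.46} $$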


Putting (A) and (B) together, we conclude that the red shift $z$ seen by an observer
at $\bar r'$ is zero when the collapse is observed to begin, increases gradually but remains
of order $a\sqrt{k} \ll 1$ until a time very close to $T = \pi/2\sqrt{k}$ has passed, and then
grows exponentially with a rate $1/2ka^3$. For example, a collapsing sphere with a
mass $M = 10^8 M_\odot$ and radius $a = 100$ light years will have a red shift $z$ of order
$10^{-3}$ for a period of order $10^5$ years, after which the red shift suddenly begins
growing exponentially with an e-folding time of order 1 min. For practical purposes,
the collapsing sphere is suddenly and completely cut off from communication
with the rest of the universe. 
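
The orders of magnitude in this example can be reproduced with a few lines of arithmetic. The sketch below is an illustration only: it restores factors of c, so that the red-shift scale is sqrt(2GM/(a c^2)), the collapse time is pi/(2 sqrt(k)), and the e-folding time 2ka^3 becomes 4GM/c^3. The physical constants and the function name are assumed values chosen for the example.

```python
import math

G_SI = 6.674e-11        # Newton's constant, m^3 kg^-1 s^-2 (assumed)
C_SI = 2.998e8          # speed of light, m/s (assumed)
M_SUN = 1.989e30        # solar mass, kg (assumed)
LIGHT_YEAR = 9.461e15   # metres (assumed)
YEAR = 3.156e7          # seconds (assumed)


def collapse_scales(mass_kg, radius_m):
    """Return (z_scale, collapse_time_s, efold_time_s) for a dust sphere.

    z_scale       = a sqrt(k) = sqrt(2GM / (a c^2))
    collapse time = pi / (2 sqrt(k)), with sqrt(k) = z_scale * c / a in SI units
    e-fold time   = 2 k a^3 = 4GM / c^3
    """
    schwarzschild_radius = 2.0 * G_SI * mass_kg / C_SI**2
    z_scale = math.sqrt(schwarzschild_radius / radius_m)
    collapse_time = 0.5 * math.pi * radius_m / (C_SI * z_scale)
    efold_time = 2.0 * schwarzschild_radius / C_SI
    return z_scale, collapse_time, efold_time


if __name__ == "__main__":
    z_scale, t_collapse, t_efold = collapse_scales(1e8 * M_SUN, 100.0 * LIGHT_YEAR)
    print("red-shift scale:", z_scale)
    print("collapse time (years):", t_collapse / YEAR)
    print("e-folding time (seconds):", t_efold)
```

With these assumed constants the sketch gives a red-shift scale of a few times 10^-4, a collapse time of a few times 10^5 years, and an e-folding time 4GM/c^3 of roughly 2 x 10^3 seconds, so the final fade-out is effectively instantaneous on the scale of the collapse.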

Completely cut off? Even if a collapsing body does fade out of sight, it still 
has a gravitational field, and, as shown in Section 7.6, the measurement of this 
field at great distances can be used to determine the energy, momentum, and 
angular momentum of the body. If the body has a net electric charge, then 
measurement of the electric field at great distances will, via Gauss’s theorem, also 
tell us the charge. It is interesting to ask whether measurements of the gravitational 
and/or electromagnetic fields outside a collapsing body can yield any information 
about the body beyond the energy, momentum, angular momentum, and charge. 
In the case of a spherically symmetric electrically neutral body, which we have 
been considering in this chapter, the answer is provided by Birkhoff’s theorem: 
The gravitational field outside a spherically symmetric body must be of the 
Schwarzschild form, so all we can ever learn about the body is its mass. (Spherical 
symmetry, of course, implies zero momentum and zero angular momentum.)
Carter 33 has shown that when the gravitational field of an axially symmetric
collapsing body settles down to a stationary state, its exterior metric belongs to a 
two-parameter family of solutions, such as the Kerr metrics (see Section 11.7) that 
are completely specified by the total mass and angular momentum. It is widely 
believed that the gravitational field of any electrically neutral collapsing body will 
eventually approach the Kerr form. 

As mentioned in the introduction to this chapter, interest in the phenomenon 
of gravitational collapse was rekindled in the last decade by the discovery of 
quasi-stellar sources, which appear to require some powerful new source of energy. 
The maximum energy available from fusion of hydrogen into the most stable 
nuclei, say iron, is only 8 MeV per nucleon, or less than 1% of the rest-mass. 
Matter-antimatter annihilation could have 100% efficiency (apart from neutrino 
energy losses), but this process can be important only if there is some abundant 
natural source of antinucleons. Otherwise the only likely mechanism for conversion
of mass into energy with high efficiency is gravitational collapse. 34
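
The efficiency figure quoted for fusion is simple arithmetic, sketched below. The nucleon rest energy is an assumed round value (the mean of the proton and neutron rest energies), and the function name is illustrative.

```python
NUCLEON_REST_ENERGY_MEV = 938.9  # mean proton/neutron rest energy (assumed value)
FUSION_YIELD_MEV = 8.0           # energy released per nucleon, as quoted in the text


def mass_to_energy_efficiency(yield_mev, rest_energy_mev=NUCLEON_REST_ENERGY_MEV):
    """Fraction of rest mass converted to energy for a given yield per nucleon."""
    return yield_mev / rest_energy_mev


if __name__ == "__main__":
    # Fusion of hydrogen to iron-group nuclei: a bit under one percent.
    print(mass_to_energy_efficiency(FUSION_YIELD_MEV))
    # Complete annihilation corresponds to the full rest energy.
    print(mass_to_energy_efficiency(NUCLEON_REST_ENERGY_MEV))
```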

A cloud of dust that is collapsing as in the Oppenheimer-Snyder model will 
obviously release no energy to the outside world. To extract the growing kinetic 
energy of the falling particles, something must slow them on the way down — 
either a macroscopic “bounce” of the whole system, or particle-particle collisions 
that heat the collapsing gas. Detailed calculations reveal a discouragingly low 
efficiency for conversion of mass into available energy in the gravitational collapse 
of an isolated body. 35 However, particles falling into a Kerr metric can reemerge
with a higher energy, acquired at the expense of the rotational energy of the 
collapsing body. 36 

Whether or not gravitational collapse has anything to do with quasi-stellar 
sources, the question remains : What happens to a real cooling star whose mass is 
above the Chandrasekhar and Oppenheimer-Volkoff limits? In recent years 
topological methods have been used by Penrose and Hawking to prove a number 
of powerful theorems, 37 to the effect that under reasonable conditions (validity 
of general relativity, positivity of energy, ubiquity of matter, causality) collapse 
becomes inevitable once a trapped surface forms. A trapped surface is a closed 
spacelike two-dimensional surface for which both the outgoing and the ingoing 
families of future-directed null geodesics orthogonal to the surface are converging. 
(For the Schwarzschild metric, the spheres with r and t constant are trapped 
surfaces for $r$ within the Schwarzschild radius $2MG$.) However, it is not known
whether a real massive star will actually develop a trapped surface, or merely 
explode into fragments with small enough mass to form stable neutron stars or 
white dwarfs. 

If gravitational collapse is indeed the inevitable fate of massive bodies, then 
we must expect that the universe is full of black holes , collapsing bodies whose 
presence is betrayed only through their gravitational fields or through the energy 
released when matter is drawn in. 38 Our best hope of observing gravitational 
collapse would be to find a binary star, one member an ordinary visible star, and 
the other member a black hole. 39

PART FOUR
FORMAL DEVELOPMENTS 




“You could validly argue 
that the minimum 
formulation is neat, but 
really no better than the 
other formulation. However 
move from this lecture 
room to your bathtub and 
observe your big toe in the 
water. Your limbs no longer 
appear straight because the 
velocity of light in the 
water differs from that in 
air. The least-time principle 
tells you how to formulate 
behaviour under such 
conditions and the 
memorizing of Snell’s law 
about angles does not. Who 
can doubt which is the 
better scientific 
explanation?” Paul A.
Samuelson, Maximum
Principles in Analytical 
Economics, Nobel Memorial 
Lecture , December 11, 1970 

12 THE ACTION PRINCIPLE 


There are a great many physical systems whose dynamic equations can be 
derived from a “principle of least action,” that is, from a statement that some 
functional of the dynamical variables, the “action,” is stationary with respect to 
small variations of these variables. This formulation of the dynamic equations 
has one great advantage: It allows us to establish an immediate connection 
between symmetry principles and conservation laws. 

The symmetry of the action that concerns us most in this book is general 
covariance. In this section we shall develop a general definition of the energy-momentum
tensor for any material system, as a functional derivative of the action
for that system. The use of the action principle and general covariance will then 
allow us to show that this tensor is indeed conserved. 

To achieve a truly general formulation of general relativity in terms of an 
action principle, it is necessary to uncover a question that has been carefully 
buried until now: How can we incorporate the effects of gravitation into the field 
theories of particles with half-integer spin ? The answer requires the development 
of an approach to general relativity, the “tetrad formalism,” based directly on the 
families of locally inertial frames that were our starting point in Chapter 3.
Although the proof is more complicated, the energy-momentum tensor in this
formalism is still symmetric and conserved. 


1 The Matter Action: An Example 

We shall begin by displaying one example of a physical system whose equations 
of motion can be derived from a principle of stationary action. The system is a 
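collisionless plasma, consisting of particles $n$ of mass $m_n$ and charge $e_n$, together
with the electromagnetic field $F_{\mu\nu}(x)$ they produce. The equations of motion in an
arbitrary external gravitational field $g_{\mu\nu}$ are

$$ 0 = \frac{d^2 x_n^\mu}{d\tau_n^2} + \Gamma^\mu_{\nu\lambda}(x_n) \frac{dx_n^\nu}{d\tau_n} \frac{dx_n^\lambda}{d\tau_n} - \frac{e_n}{m_n} F^\mu{}_\nu(x_n) \frac{dx_n^\nu}{d\tau_n} \tag{12.1.1} $$

$$ d\tau_n \equiv (-g_{\mu\nu}\, dx_n^\mu\, dx_n^\nu)^{1/2} \tag{12.1.2} $$

$$ \frac{\partial}{\partial x^\mu} \left[ \sqrt{g(x)}\, F^{\mu\nu}(x) \right] = -\sum_n e_n \int \delta^4(x - x_n) \frac{dx_n^\nu}{d\tau_n}\, d\tau_n \tag{12.1.3} $$

$$ \frac{\partial}{\partial x^\lambda} F_{\mu\nu}(x) + \frac{\partial}{\partial x^\nu} F_{\lambda\mu}(x) + \frac{\partial}{\partial x^\mu} F_{\nu\lambda}(x) = 0 \tag{12.1.4} $$

[See Eqs. (5.2.9), (5.1.11), (5.2.6), (5.2.13), (5.2.7).] In order to satisfy (12.1.4),
we introduce a vector potential $A_\mu(x)$:

$$ F_{\mu\nu}(x) \equiv \frac{\partial A_\nu(x)}{\partial x^\mu} - \frac{\partial A_\mu(x)}{\partial x^\nu} \tag{12.1.5} $$

The independent dynamical variables may then be taken as $A_\mu(x)$ and $x_n^\mu(p)$,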
where p is some quantity that simultaneously parameterizes all the space-time 
trajectories of the various particles. 

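We tentatively take the action for this system as

$$ I_M = -\sum_n m_n \int_{-\infty}^{\infty} dp \left[ -g_{\mu\nu}(x_n(p)) \frac{dx_n^\mu(p)}{dp} \frac{dx_n^\nu(p)}{dp} \right]^{1/2} - \frac{1}{4} \int d^4x\, \sqrt{g(x)}\, F_{\mu\nu}(x) F^{\mu\nu}(x) + \sum_n e_n \int_{-\infty}^{\infty} dp\, \frac{dx_n^\mu(p)}{dp} A_\mu(x_n(p)) \tag{12.1.6} $$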

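(The subscript $M$ is to remind us that this is the action only for matter and
radiation, with $g_{\mu\nu}(x)$ taken as a prescribed external gravitational field.) It is
understood here that $F_{\mu\nu}$ is given by Eq. (12.1.5), and the indices on $F^{\mu\nu}$ are
raised with the contravariant metric tensor as usual.

The principle of stationary action says that the action $I_M$ will not be changed
by an infinitesimal variation in the dynamical variables

$$ x_n^\mu(p) \to x_n^\mu(p) + \delta x_n^\mu(p) $$

$$ A_\mu(x) \to A_\mu(x) + \delta A_\mu(x) $$

where

$$ \delta x_n^\mu(p) \to 0 \quad \text{for } |p| \to \infty $$

$$ \delta A_\mu(x) \to 0 \quad \text{for } |x^\lambda| \to \infty $$

if and only if $x_n^\mu(p)$ and $A_\mu(x)$ obey the dynamical equations (12.1.1)-(12.1.3). To
check that this is correct, let us compute the change in $I_M$ produced by this
variation, without yet assuming that (12.1.1)-(12.1.3) are satisfied. We find that

$$ \delta I_M = \frac{1}{2} \sum_n m_n \int_{-\infty}^{\infty} dp \left[ -g_{\mu\nu}(x_n(p)) \frac{dx_n^\mu}{dp} \frac{dx_n^\nu}{dp} \right]^{-1/2} \left\{ 2 g_{\mu\lambda}(x_n) \frac{dx_n^\mu}{dp} \frac{d\, \delta x_n^\lambda}{dp} + \frac{\partial g_{\mu\nu}(x_n)}{\partial x_n^\lambda} \frac{dx_n^\mu}{dp} \frac{dx_n^\nu}{dp} \delta x_n^\lambda \right\} $$
$$ \qquad - \int d^4x\, \sqrt{g(x)}\, F^{\mu\nu}(x) \frac{\partial}{\partial x^\mu} \delta A_\nu(x) + \sum_n e_n \int_{-\infty}^{\infty} dp \left\{ \frac{d\, \delta x_n^\mu}{dp} A_\mu(x_n) + \frac{dx_n^\mu}{dp} \frac{\partial A_\mu(x_n)}{\partial x_n^\nu} \delta x_n^\nu + \frac{dx_n^\mu}{dp} \delta A_\mu(x_n) \right\} $$

It is convenient at this point to change variables of integration from $p$ to the $\tau_n$
defined by Eq. (12.1.2). This gives

$$ \delta I_M = \frac{1}{2} \sum_n m_n \int_{-\infty}^{\infty} d\tau_n \left\{ 2 g_{\mu\lambda}(x_n) \frac{dx_n^\mu}{d\tau_n} \frac{d\, \delta x_n^\lambda}{d\tau_n} + \frac{\partial g_{\mu\nu}(x_n)}{\partial x_n^\lambda} \frac{dx_n^\mu}{d\tau_n} \frac{dx_n^\nu}{d\tau_n} \delta x_n^\lambda \right\} $$
$$ \qquad - \int d^4x\, \sqrt{g(x)}\, F^{\mu\nu}(x) \frac{\partial}{\partial x^\mu} \delta A_\nu(x) + \sum_n e_n \int_{-\infty}^{\infty} d\tau_n \left\{ \frac{d\, \delta x_n^\mu}{d\tau_n} A_\mu(x_n) + \frac{dx_n^\mu}{d\tau_n} \frac{\partial A_\mu(x_n)}{\partial x_n^\nu} \delta x_n^\nu + \frac{dx_n^\mu}{d\tau_n} \delta A_\mu(x_n) \right\} $$

The condition that $\delta x_n^\mu(\tau_n)$ and $\delta A_\mu(x)$ vanish on the boundaries of the region of
integration allows us to integrate by parts, and we obtain

$$ \delta I_M = -\sum_n m_n \int_{-\infty}^{\infty} d\tau_n\, g_{\mu\lambda}(x_n) \left\{ \frac{d^2 x_n^\mu}{d\tau_n^2} + \Gamma^\mu_{\nu\sigma}(x_n) \frac{dx_n^\nu}{d\tau_n} \frac{dx_n^\sigma}{d\tau_n} - \frac{e_n}{m_n} F^\mu{}_\nu(x_n) \frac{dx_n^\nu}{d\tau_n} \right\} \delta x_n^\lambda $$
$$ \qquad + \int d^4x \left\{ \frac{\partial}{\partial x^\mu} \left[ \sqrt{g(x)}\, F^{\mu\nu}(x) \right] + \sum_n e_n \int_{-\infty}^{\infty} d\tau_n\, \delta^4(x - x_n) \frac{dx_n^\nu}{d\tau_n} \right\} \delta A_\nu(x) $$

Evidently $\delta I_M$ vanishes for general variations $\delta x_n^\lambda$ and $\delta A_\nu$ if and only if $x_n^\lambda$ and
$A_\nu$ obey the dynamical equations (12.1.1) and (12.1.3), and we therefore conclude
that (12.1.6) does qualify as a suitable action for this system.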


2 General Definition of $T^{\mu\nu}$


We are going to define the energy-momentum tensor for a material system 
described by an action $I_M$ as the “functional derivative” of $I_M$ with respect to
$g_{\mu\nu}$. That is, we imagine $g_{\mu\nu}(x)$ to be subject to an infinitesimal variation

$$ g_{\mu\nu}(x) \to g_{\mu\nu}(x) + \delta g_{\mu\nu}(x) \tag{12.2.1} $$

where $\delta g_{\mu\nu}$ is arbitrary, except that it is required to vanish as $|x^\lambda| \to \infty$. The action
$I_M$ will not be stationary with respect to this variation, because for the moment
we are regarding $g_{\mu\nu}(x)$ not as a dynamical variable like $x_n^\mu$ or $A_\mu$ but as an external
field. Rather, $\delta I_M$ will be some linear functional of the infinitesimal $\delta g_{\mu\nu}(x)$, and
therefore takes the form

$$ \delta I_M = \frac{1}{2} \int d^4x\, \sqrt{g(x)}\, T^{\mu\nu}(x)\, \delta g_{\mu\nu}(x) \tag{12.2.2} $$

The coefficient $T^{\mu\nu}(x)$ is defined to be the energy-momentum tensor of this system.

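A general proof that $T^{\mu\nu}$ is a conserved symmetric tensor will be given in the
next section. However, let us first check that (12.2.2) gives the correct energy-momentum
tensor for the collisionless plasma described by the action (12.1.6).
We are varying $g_{\mu\nu}$ with $A_\mu$ held fixed, so

$$ \delta F^{\mu\nu} = F_{\rho\sigma}\, \delta[g^{\mu\rho} g^{\nu\sigma}] = F^\mu{}_\sigma\, \delta g^{\nu\sigma} + F_\rho{}^\nu\, \delta g^{\mu\rho} $$

To calculate $\delta g^{\nu\sigma}$, we note that

$$ 0 = \delta[g^{\nu\sigma} g_{\sigma\lambda}] = g^{\nu\sigma}\, \delta g_{\sigma\lambda} + g_{\sigma\lambda}\, \delta g^{\nu\sigma} $$

so

$$ \delta g^{\nu\sigma} = -g^{\nu\lambda} g^{\sigma\kappa}\, \delta g_{\lambda\kappa} $$

and therefore

$$ \delta F^{\mu\nu} = -F^{\mu\kappa} g^{\nu\lambda}\, \delta g_{\lambda\kappa} + F^{\nu\kappa} g^{\mu\lambda}\, \delta g_{\lambda\kappa} $$

Also, we showed in Section 4.7 that

$$ \delta g = g\, g^{\lambda\kappa}\, \delta g_{\lambda\kappa} $$

A straightforward calculation gives then

$$ \delta I_M = \frac{1}{2} \sum_n m_n \int_{-\infty}^{\infty} dp \left[ -g_{\mu\nu}(x_n(p)) \frac{dx_n^\mu(p)}{dp} \frac{dx_n^\nu(p)}{dp} \right]^{-1/2} \frac{dx_n^\lambda(p)}{dp} \frac{dx_n^\kappa(p)}{dp}\, \delta g_{\lambda\kappa}(x_n(p)) $$
$$ \qquad + \frac{1}{2} \int d^4x\, g^{1/2}(x) \left\{ F_\mu{}^\lambda(x) F^{\mu\kappa}(x) - \tfrac{1}{4} g^{\lambda\kappa}(x) F_{\mu\nu}(x) F^{\mu\nu}(x) \right\} \delta g_{\lambda\kappa}(x) $$

This is of the form (12.2.2), with

$$ T^{\lambda\kappa}(x) = g^{-1/2}(x) \sum_n m_n \int_{-\infty}^{\infty} \frac{dx_n^\lambda}{d\tau_n} \frac{dx_n^\kappa}{d\tau_n}\, \delta^4(x - x_n)\, d\tau_n + F_\mu{}^\lambda(x) F^{\mu\kappa}(x) - \tfrac{1}{4} g^{\lambda\kappa}(x) F_{\mu\nu}(x) F^{\mu\nu}(x) $$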

in agreement with the previously obtained energy-momentum tensor, given by 
Eqs. (5.3.5) and (5.3.7). 
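
The two variational identities used in this check, for the change in the inverse metric and in the determinant, can be tested to first order on a small example. The sketch below is an illustration under stated assumptions: a 2 x 2 symmetric matrix stands in for the metric, the perturbation is arbitrary, and all function names are hypothetical. It compares the actual change in the determinant and inverse against the first-order predictions g g^{lambda kappa} delta g_{lambda kappa} and -g^{nu lambda} g^{sigma kappa} delta g_{lambda kappa}.

```python
def det2(m):
    """Determinant of a 2x2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def inv2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    d = det2(m)
    return [[m[1][1] / d, -m[0][1] / d], [-m[1][0] / d, m[0][0] / d]]


def first_order_predictions(g, dg):
    """Return (predicted change in det g, predicted change in the inverse of g)."""
    gi = inv2(g)
    idx = range(2)
    # delta(det g) = det g * g^{lambda kappa} * delta g_{lambda kappa}
    d_det = det2(g) * sum(gi[l][k] * dg[l][k] for l in idx for k in idx)
    # delta g^{nu sigma} = - g^{nu lambda} g^{sigma kappa} delta g_{lambda kappa}
    d_inv = [[-sum(gi[n][l] * gi[s][k] * dg[l][k] for l in idx for k in idx)
              for s in idx] for n in idx]
    return d_det, d_inv


if __name__ == "__main__":
    eps = 1e-6
    g = [[2.0, 0.3], [0.3, 1.5]]                      # stand-in metric
    dg = [[eps, 2.0 * eps], [2.0 * eps, -3.0 * eps]]  # small symmetric variation
    g_new = [[g[i][j] + dg[i][j] for j in range(2)] for i in range(2)]
    d_det, d_inv = first_order_predictions(g, dg)
    print(det2(g_new) - det2(g), d_det)
    print(inv2(g_new)[0][1] - inv2(g)[0][1], d_inv[0][1])
```

The residual differences are second order in the perturbation, as expected for first-order identities.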

The definition (12.2.2) is closely analogous to a similar definition of the electric
current $J^\mu$. We can break up the total matter action into a purely electromagnetic
term $I_E$ and another term $I'_M$ that describes the charged particles and their electromagnetic
interactions

$$ I_M = I_E + I'_M \tag{12.2.3} $$

$$ I_E \equiv -\frac{1}{4} \int d^4x\, g^{1/2}(x)\, F_{\mu\nu}(x) F^{\mu\nu}(x) \tag{12.2.4} $$

Consider the effect on $I'_M$ of an infinitesimal variation in the vector potential

$$ A_\mu \to A_\mu + \delta A_\mu \tag{12.2.5} $$

Since $I'_M$ is not the whole action, the change in $I'_M$ due to this variation in $A_\mu$
does not vanish, but it is necessarily a linear functional of $\delta A_\mu$,

$$ \delta I'_M = \int d^4x\, \sqrt{g(x)}\, J^\mu(x)\, \delta A_\mu(x) \tag{12.2.6} $$

and the coefficient $J^\mu(x)$ is defined to be the electromagnetic current of the system.
For instance, for the collisionless plasma described by Eq. (12.1.6), the term $I'_M$
is given as the sum of the first and third terms in Eq. (12.1.6), and we immediately
find that

$$ \delta I'_M = \sum_n e_n \int_{-\infty}^{\infty} dx_n^\mu\, \delta A_\mu(x_n) $$

This is of the form (12.2.6), with

$$ J^\mu(x) = g^{-1/2}(x) \sum_n e_n \int_{-\infty}^{\infty} \delta^4(x - x_n)\, dx_n^\mu $$

in agreement with Eq. (5.2.13). The proof that (12.2.6) always yields a conserved
current $J^\mu(x)$ is given in the next section.


3 General Covariance and Energy-Momentum Conservation 

If the action $I_M$ for a material system is a scalar, then the statement that
$\delta I_M$ vanishes is generally covariant, and so also are the dynamical equations
derived from this statement. For instance, a glance at the action (12.1.6) for a
collisionless plasma shows that this $I_M$ is a scalar, and this ensures the general
covariance of the dynamical equations (12.1.1)-(12.1.3) that follow from (12.1.6)
upon application of the principle of stationary action. 

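We shall therefore assume that $I_M$ is a scalar. This means that $I_M$ will be
unchanged if we simultaneously make the replacements

$$ d^4x \to d^4x' $$
$$ \frac{\partial}{\partial x^\mu} \to \frac{\partial}{\partial x'^\mu} $$
$$ x_n^\mu(p) \to x_n'^\mu(p) $$
$$ A_\mu(x) \to A'_\mu(x') = A_\nu(x) \frac{\partial x^\nu}{\partial x'^\mu} $$
$$ g_{\mu\nu}(x) \to g'_{\mu\nu}(x') = g_{\rho\sigma}(x) \frac{\partial x^\rho}{\partial x'^\mu} \frac{\partial x^\sigma}{\partial x'^\nu} $$

However, $x'^\mu$ is a mere variable of integration (as opposed to $x_n^\mu$, which is a
dynamical variable) so we can change $x'^\mu$ back to $x^\mu$ without changing $I_M$. We
conclude then that $I_M$ is unchanged by the replacements

$$ x_n^\mu(p) \to x_n'^\mu(p) $$
$$ A_\mu(x) \to A'_\mu(x) = A_\nu(x) \frac{\partial x^\nu}{\partial x'^\mu} - [A'_\mu(x') - A'_\mu(x)] $$
$$ g_{\mu\nu}(x) \to g'_{\mu\nu}(x) = g_{\rho\sigma}(x) \frac{\partial x^\rho}{\partial x'^\mu} \frac{\partial x^\sigma}{\partial x'^\nu} - [g'_{\mu\nu}(x') - g'_{\mu\nu}(x)] $$

with $d^4x$ and $\partial/\partial x^\mu$ now left alone. If the original transformation $x^\mu \to x'^\mu$ was
infinitesimal

$$ x'^\mu = x^\mu + \varepsilon^\mu(x) $$

then the change in the dynamical variables is

$$ \delta x_n^\mu(p) = \varepsilon^\mu(x_n) $$
$$ \delta A_\mu(x) = -A_\nu(x) \frac{\partial \varepsilon^\nu(x)}{\partial x^\mu} - \frac{\partial A_\mu(x)}{\partial x^\nu} \varepsilon^\nu(x) $$
$$ \delta g_{\mu\nu}(x) = -g_{\mu\lambda}(x) \frac{\partial \varepsilon^\lambda(x)}{\partial x^\nu} - g_{\lambda\nu}(x) \frac{\partial \varepsilon^\lambda(x)}{\partial x^\mu} - \frac{\partial g_{\mu\nu}(x)}{\partial x^\lambda} \varepsilon^\lambda(x) \tag{12.3.1} $$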


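(This change in $A_\mu$ or $g_{\mu\nu}$ is just the Lie derivative; see Section 10.9.) The important
point is that this is now an infinitesimal transformation of the dynamical variables
alone, not of the coordinates over which we integrate, so the principle of stationary
action tells us that when the dynamical equations for $x_n^\mu$, $A_\mu$, and so on, are
satisfied the change in these quantities produces no change in the matter action
$I_M$. The only change in $I_M$ comes from the variation in the external field $g_{\mu\nu}$, and
(12.2.2) gives for this change

$$ \delta I_M = -\frac{1}{2} \int d^4x\, \sqrt{g(x)}\, T^{\mu\nu}(x) \left[ g_{\mu\lambda}(x) \frac{\partial \varepsilon^\lambda(x)}{\partial x^\nu} + g_{\lambda\nu}(x) \frac{\partial \varepsilon^\lambda(x)}{\partial x^\mu} + \frac{\partial g_{\mu\nu}(x)}{\partial x^\lambda} \varepsilon^\lambda(x) \right] $$

If $I_M$ is a scalar, then this must vanish; integrating by parts gives then

$$ 0 = \int d^4x\, \varepsilon^\lambda(x) \left[ \frac{\partial}{\partial x^\nu} \left( \sqrt{g}\, T^\nu{}_\lambda \right) - \frac{1}{2} \sqrt{g}\, \frac{\partial g_{\mu\nu}}{\partial x^\lambda} T^{\mu\nu} \right] $$

and since $\varepsilon^\lambda(x)$ is arbitrary

$$ 0 = \frac{\partial}{\partial x^\nu} \left( \sqrt{g}\, T^\nu{}_\lambda \right) - \frac{1}{2} \sqrt{g}\, \frac{\partial g_{\mu\nu}}{\partial x^\lambda} T^{\mu\nu} $$

or, recalling (4.7.6):

$$ 0 = (T^\nu{}_\lambda)_{;\nu} \tag{12.3.2} $$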


Thus the energy-momentum tensor defined by Eq. (12.2.2) is conserved (in the covariant
sense) if and only if the matter action is a scalar. Also, with $I_M$ a scalar, (12.2.2)
shows immediately that $T^{\mu\nu}$ is a symmetric tensor, so this definition of the energy-momentum
tensor has all the properties for which one could wish.

This proof, that general covariance implies energy-momentum conservation,
has an exact analog in the proof that gauge invariance implies charge conservation.
The change in the action $I'_M$ defined by (12.2.3) caused by an arbitrary gauge
transformation can arise only from the change in $A_\mu$, since $I'_M$ is stationary with
respect to all other dynamical variables. A general infinitesimal gauge transformation
$\varepsilon$ will produce in $A_\mu$ the change (see Sections 4.11 and 10.2)

$$ \delta A_\mu = \frac{\partial \varepsilon}{\partial x^\mu} $$

Using this in (12.2.6), we see that $I'_M$ is gauge invariant if and only if

$$ 0 = \int d^4x\,\sqrt{g}\;J^\mu\,\frac{\partial \varepsilon}{\partial x^\mu} $$

Integrating by parts gives

$$ 0 = \int d^4x\;\varepsilon\,\frac{\partial}{\partial x^\mu}\left(\sqrt{g}\,J^\mu\right) $$

or, since $\varepsilon$ is arbitrary

$$ 0 = J^\mu{}_{;\mu} = \frac{1}{\sqrt{g}}\,\frac{\partial}{\partial x^\mu}\left(\sqrt{g}\,J^\mu\right) $$

We see again how closely analogous are gauge invariance and general covariance.
4 The Gravitational Action

So far, in this chapter the gravitational field $g_{\mu\nu}$ has been an external field
that could be prescribed at will. (Indeed, (12.2.2) usually provides the most
convenient definition of the energy-momentum tensor even in the absence of
gravitation.) We will now give $g_{\mu\nu}$ field equations of its own, by adding to the
total action $I$ a purely gravitational term $I_G$

$$ I = I_M + I_G \tag{12.4.1} $$

$$ I_G \equiv -\frac{1}{16\pi G}\int \sqrt{g(x)}\;R(x)\,d^4x \tag{12.4.2} $$

Clearly I G is a scalar, so this would be a good candidate for a theory of gravitation 
even if we had no experience with gravitational phenomena. We shall now show 
that the application to I of the principle of stationary action does in fact yield 
Einstein’s theory. 

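The curvature scalar $R$ is defined as $g^{\mu\nu}R_{\mu\nu}$, so a variation $\delta g_{\mu\nu}$ in the metric
produces in the integrand of (12.4.2) a change

$$ \delta\left(\sqrt{g}\,R\right) = \sqrt{g}\,R_{\mu\nu}\,\delta g^{\mu\nu} + R\,\delta\sqrt{g} + \sqrt{g}\,g^{\mu\nu}\,\delta R_{\mu\nu} $$

According to Eq. (10.9.2), the change in the Ricci tensor is

$$ \delta R_{\mu\nu} = \left(\delta\Gamma^\lambda_{\mu\lambda}\right)_{;\nu} - \left(\delta\Gamma^\lambda_{\mu\nu}\right)_{;\lambda} $$

the covariant derivatives being defined as if $\delta\Gamma^\lambda_{\mu\nu}$ were a tensor (as, in fact, it is).
Thus the last term in $\delta(\sqrt{g}\,R)$ is

$$ \sqrt{g}\,g^{\mu\nu}\,\delta R_{\mu\nu} = \sqrt{g}\left[\left(g^{\mu\nu}\,\delta\Gamma^\lambda_{\mu\lambda}\right)_{;\nu} - \left(g^{\mu\nu}\,\delta\Gamma^\lambda_{\mu\nu}\right)_{;\lambda}\right] $$

or, using (4.7.7),

$$ \sqrt{g}\,g^{\mu\nu}\,\delta R_{\mu\nu} = \frac{\partial}{\partial x^\nu}\left(\sqrt{g}\,g^{\mu\nu}\,\delta\Gamma^\lambda_{\mu\lambda}\right) - \frac{\partial}{\partial x^\lambda}\left(\sqrt{g}\,g^{\mu\nu}\,\delta\Gamma^\lambda_{\mu\nu}\right) $$

This term therefore drops out when we integrate over all space. Also,

$$ \delta\sqrt{g} = \tfrac{1}{2}\sqrt{g}\,g^{\mu\nu}\,\delta g_{\mu\nu} \qquad \delta g^{\mu\nu} = -g^{\mu\rho}g^{\nu\sigma}\,\delta g_{\rho\sigma} $$

so the change in the action (12.4.2) is

$$ \delta I_G = \frac{1}{16\pi G}\int \sqrt{g}\left[R^{\mu\nu} - \tfrac{1}{2}g^{\mu\nu}R\right]\delta g_{\mu\nu}\,d^4x \tag{12.4.3} $$

Combining (12.4.3) with (12.2.2), we see that the total action $I$ is stationary with
respect to arbitrary variations in $g_{\mu\nu}$ if and only if

$$ R^{\mu\nu} - \tfrac{1}{2}g^{\mu\nu}R + 8\pi G\,T^{\mu\nu} = 0 $$

which, of course, is the Einstein field equation.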
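
The two algebraic identities used in this variation, for $\delta\sqrt{g}$ and for $\delta g^{\mu\nu}$, are easy to test numerically. The following is only a minimal sketch under stated assumptions: it works in two dimensions, takes $g$ to mean the absolute value of the determinant, and uses a sample metric and variation (`SAMPLE_METRIC`, `SAMPLE_VARIATION`) that are purely illustrative and do not come from the text.

```python
import math

def det2(m):
    """Determinant of a 2x2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def inv2(m):
    """Inverse of a 2x2 matrix; for the metric this is g^{mu nu}."""
    d = det2(m)
    return [[m[1][1] / d, -m[0][1] / d],
            [-m[1][0] / d, m[0][0] / d]]

def sqrt_g(m):
    """sqrt(g) with g = |Det g_{mu nu}|, so indefinite metrics are allowed."""
    return math.sqrt(abs(det2(m)))

def variation_errors(g, dg, eps=1e-6):
    """Compare finite-difference variations with the first-order formulas.

    Returns (error in delta sqrt(g), largest error in delta g^{mu nu}).
    """
    idx = range(2)
    g_new = [[g[i][j] + eps * dg[i][j] for j in idx] for i in idx]
    ginv, ginv_new = inv2(g), inv2(g_new)

    # delta sqrt(g) = (1/2) sqrt(g) g^{mu nu} delta g_{mu nu}
    predicted = 0.5 * sqrt_g(g) * sum(ginv[i][j] * dg[i][j] for i in idx for j in idx)
    measured = (sqrt_g(g_new) - sqrt_g(g)) / eps
    err_sqrt = abs(predicted - measured)

    # delta g^{mu nu} = - g^{mu rho} g^{nu sigma} delta g_{rho sigma}
    err_inv = 0.0
    for i in idx:
        for j in idx:
            predicted_ij = -sum(ginv[i][r] * ginv[j][s] * dg[r][s]
                                for r in idx for s in idx)
            measured_ij = (ginv_new[i][j] - ginv[i][j]) / eps
            err_inv = max(err_inv, abs(predicted_ij - measured_ij))
    return err_sqrt, err_inv

# Illustrative (hypothetical) indefinite metric and symmetric variation.
SAMPLE_METRIC = [[-1.0, 0.3], [0.3, 2.0]]
SAMPLE_VARIATION = [[0.2, -0.1], [-0.1, 0.5]]
```

Both returned errors should shrink in proportion to `eps`, as expected for formulas that are exact to first order in the variation.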
[As another application of (12.4.3), we may use it to derive the contracted
Bianchi identity. Since $I_G$ is a scalar it must be stationary with respect to the
variation (12.3.1) in $g_{\mu\nu}$. Repeating the reasoning that led before to Eq. (12.3.2),
we now find that

$$ \left(R^{\mu\nu} - \tfrac{1}{2}g^{\mu\nu}R\right)_{;\nu} = 0 $$

which we recognize as the contracted Bianchi identity (6.8.3).]

This formalism suggests that Einstein’s theory might be modified by adding
to $R$ in Eq. (12.4.2) terms proportional to $R^2$, $R^3$, etc. As discussed in Section 7.1,
such terms would only show up on a sufficiently small space-time scale.


5 The Tetrad Formalism* 


Until now, we have followed only one approach in determining the effects of
gravitation on general physical systems. Given the special-relativistic equations
that govern the system in the absence of gravitation, we replace all Lorentz
tensors with objects that behave like tensors (or tensor densities) under
general coordinate transformations. Also, we replace all derivatives $\partial/\partial x^\alpha$ with
covariant derivatives, and replace $\eta_{\alpha\beta}$ everywhere with $g_{\mu\nu}$. The equations of
motion are then generally covariant. (See Chapter 5.)

This method actually works only for objects that behave like tensors under
Lorentz transformation, and not for the spinor fields discussed in Section 2.12.
(Mathematically, this is because the tensor representations of the group GL(4) of
general linear 4 x 4 matrices behave like tensors under the subgroup of Lorentz
transformations, but there are no representations of GL(4), or even “representations
up to a sign,” which behave like spinors under the Lorentz subgroup.) How
then can we incorporate spinors into general relativity?

The answer lies in a different approach to the problem of determining the 
effects of gravitation on physical systems, an approach that is rather interesting 
in its own right, quite apart from the problem of dealing with spinors. 

To start, let us take advantage of the Principle of Equivalence, and at every
point $X$ erect a set of coordinates $\xi_X^\alpha$ that are locally inertial at $X$. (Of course, it
will not be possible to erect any single coordinate system that is locally inertial
everywhere, unless the space-time continuum is “flat.”) As shown in Sections 3.2
and 3.3, the metric in any general noninertial coordinate system is then

$$ g_{\mu\nu}(x) = V^\alpha{}_\mu(x)\,V^\beta{}_\nu(x)\,\eta_{\alpha\beta} \tag{12.5.1} $$

where

$$ V^\alpha{}_\mu(X) \equiv \left(\frac{\partial \xi_X^\alpha(x)}{\partial x^\mu}\right)_{x=X} \tag{12.5.2} $$


* This section lies somewhat out of the book’s main line of development, and may be omitted in a first 
reading. 
Note that we fix the locally inertial coordinates $\xi_X^\alpha$ once and for all at each physical
point $X$, so when we change our general noninertial coordinates from $x^\mu$ to $x'^\mu$, the
partial derivatives $V^\alpha{}_\mu$ change according to the rule

$$ V^\alpha{}_\mu \to V'^\alpha{}_\mu = \frac{\partial x^\nu}{\partial x'^\mu}\,V^\alpha{}_\nu \tag{12.5.3} $$

Thus, we are to think of $V^\alpha{}_\mu$ as forming four covariant vector fields, not as a single
tensor. This set of four vectors is known as a tetrad, or vierbein.

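Now, given any contravariant vector field $A^\mu(x)$, we can use the tetrad to
refer its components at $x$ to the coordinate system $\xi_x^\alpha$ locally inertial at $x$:

$$ {}^*A^\alpha \equiv V^\alpha{}_\mu\,A^\mu \tag{12.5.4} $$

We are contracting a contravariant vector $A^\mu$ with four covariant vectors $V^\alpha{}_\mu$, so
this has the effect of replacing the single four-vector $A^\mu$ with the four scalars
${}^*A^\alpha$. We can do the same with covariant vector fields, and indeed with general
tensor fields:

$$ {}^*A_\alpha \equiv V_\alpha{}^\mu\,A_\mu \qquad {}^*B^\alpha{}_\beta \equiv V^\alpha{}_\mu\,V_\beta{}^\nu\,B^\mu{}_\nu, \quad \text{etc.} \tag{12.5.5} $$

Here $V_\beta{}^\nu$ is just the tetrad (12.5.2), but with $\alpha$-index lowered with the Minkowski
tensor $\eta_{\alpha\beta}$ and $\mu$-index raised with the metric tensor $g^{\mu\nu}$:

$$ V_\beta{}^\nu \equiv \eta_{\alpha\beta}\,g^{\mu\nu}\,V^\alpha{}_\mu \tag{12.5.6} $$

Note that according to Eq. (12.5.1) this is just the inverse of the tetrad

$$ \delta^\mu{}_\nu = V_\alpha{}^\mu\,V^\alpha{}_\nu \tag{12.5.7} $$

and hence also

$$ \delta^\alpha{}_\beta = V^\alpha{}_\mu\,V_\beta{}^\mu \tag{12.5.8} $$

The scalar components of the metric tensor are then simply

$$ {}^*g_{\alpha\beta} \equiv V_\alpha{}^\mu\,V_\beta{}^\nu\,g_{\mu\nu} = \eta_{\alpha\beta} \tag{12.5.9} $$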
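
The algebra of (12.5.1) and (12.5.6) through (12.5.9) can be checked on a concrete tetrad. The sketch below is illustrative only: it assumes a two-dimensional space with $\eta = \mathrm{diag}(-1, +1)$, and the numerical tetrad `SAMPLE_TETRAD` is a hypothetical choice, stored as `V[a][m]` for $V^\alpha{}_\mu$.

```python
ETA = [[-1.0, 0.0], [0.0, 1.0]]   # Minkowski tensor in two dimensions (assumed signature)
R = range(2)

def inv2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    d = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / d, -m[0][1] / d],
            [-m[1][0] / d, m[0][0] / d]]

def metric_from_tetrad(V):
    """g_{mu nu} = V^alpha_mu V^beta_nu eta_{alpha beta}, Eq. (12.5.1)."""
    return [[sum(V[a][m] * V[b][n] * ETA[a][b] for a in R for b in R)
             for n in R] for m in R]

def inverse_tetrad(V):
    """W[b][n] = V_beta^nu = eta_{alpha beta} g^{mu nu} V^alpha_mu, Eq. (12.5.6)."""
    ginv = inv2(metric_from_tetrad(V))
    return [[sum(ETA[a][b] * ginv[m][n] * V[a][m] for a in R for m in R)
             for n in R] for b in R]

def tetrad_checks(V, tol=1e-9):
    """Return three booleans for Eqs. (12.5.7), (12.5.8) and (12.5.9)."""
    g = metric_from_tetrad(V)
    W = inverse_tetrad(V)
    kron = lambda i, j: 1.0 if i == j else 0.0
    # (12.5.7): V_alpha^mu V^alpha_nu = delta^mu_nu
    ok_7 = all(abs(sum(W[a][m] * V[a][n] for a in R) - kron(m, n)) < tol
               for m in R for n in R)
    # (12.5.8): V^alpha_mu V_beta^mu = delta^alpha_beta
    ok_8 = all(abs(sum(V[a][m] * W[b][m] for m in R) - kron(a, b)) < tol
               for a in R for b in R)
    # (12.5.9): V_alpha^mu V_beta^nu g_{mu nu} = eta_{alpha beta}
    ok_9 = all(abs(sum(W[a][m] * W[b][n] * g[m][n] for m in R for n in R)
                   - ETA[a][b]) < tol for a in R for b in R)
    return ok_7, ok_8, ok_9

# Hypothetical non-diagonal tetrad with nonzero determinant.
SAMPLE_TETRAD = [[1.5, 0.2], [-0.4, 0.8]]
```

Any invertible `V` may be substituted for the sample; the three checks depend only on the definitions, not on the particular numbers.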

Now that we have shown how to make any tensor field into a set of scalars,
we can forget the original tensors $V^\mu$, $T_{\mu\nu}$, and so on, with which we started, and
consider how we would construct an action if we worked from the beginning with
the scalars ${}^*V^\alpha$, ${}^*T_{\alpha\beta}$, and so on. In this way, a spinor field, like Dirac’s electron
field, can be brought into our formalism in precisely the same way as any other
field, and its peculiar Lorentz transformation properties cause no particular
trouble. There are now two invariance principles which must be met in constructing
a suitable matter action $I_M$.

(A) The action must be generally covariant, with all fields treated as scalars, 
except for the tetrad itself. 
(B) The Principle of Equivalence requires that special relativity should apply
in locally inertial frames, and in particular, that it should make no difference which
locally inertial frame we choose at each point. Thus since our scalar field components
${}^*V^\alpha$, ${}^*T_{\alpha\beta}$, and so on, are defined with respect to an arbitrarily chosen
locally inertial coordinate system, the field equations and the action must be
invariant with respect to a redefinition of these locally inertial coordinate systems
at each point, or in other words, with respect to Lorentz transformations $\Lambda^\alpha{}_\beta(x)$
that can depend on position in space-time:

$$ {}^*A^\alpha(x) \to \Lambda^\alpha{}_\beta(x)\,{}^*A^\beta(x) $$

$$ {}^*T_{\alpha\beta}(x) \to \Lambda_\alpha{}^\gamma(x)\,\Lambda_\beta{}^\delta(x)\,{}^*T_{\gamma\delta}(x), \quad \text{etc.} $$

where

$$ \eta_{\alpha\beta}\,\Lambda^\alpha{}_\gamma(x)\,\Lambda^\beta{}_\delta(x) = \eta_{\gamma\delta} \tag{12.5.10} $$

The tetrad (12.5.2) changes according to the same rule as ${}^*A^\alpha$:

$$ V^\alpha{}_\mu(x) \to \Lambda^\alpha{}_\beta(x)\,V^\beta{}_\mu(x) \tag{12.5.11} $$

and in general an arbitrary field ${}^*\psi_n(x)$ will change according to the rule:

$$ {}^*\psi_n(x) \to \sum_m \left[D(\Lambda(x))\right]_{nm}\,{}^*\psi_m(x) \tag{12.5.12} $$

where $D(\Lambda)$ is a matrix representation of the Lorentz group (or at least of the
infinitesimal Lorentz group), of the sort discussed in Section 2.12.

These two invariance principles lead to a dual classification of physical
quantities. A coordinate scalar or coordinate tensor transforms as a scalar or a
tensor under changes in the coordinate system. A Lorentz scalar or Lorentz tensor
or Lorentz spinor transforms according to a rule like (12.5.12), with $D(\Lambda)$ the
identity or a tensor representation or a spinor representation of the infinitesimal
Lorentz group, under changes in the choice of the locally inertial coordinate frame.
For instance, a field such as (12.5.4) is a coordinate scalar and a Lorentz vector,
the Dirac field of the electron is a coordinate scalar and a Lorentz spinor, and the
tetrad $V^\alpha{}_\mu$ is a coordinate vector and a Lorentz vector. To be physically acceptable,
the matter action $I_M$ must be both a coordinate scalar and a Lorentz scalar.

At this point, the reader may be becoming uneasy. How is the gravitational
field going to get into this sort of theory, when the coordinate-scalar components
(12.5.9) of the metric tensor are just the constants $\eta_{\alpha\beta}$? The answer is that
gravitational tensor fields appear in the action because, and only because, of the necessity
of introducing derivatives into the theory. If it made sense to construct an action
$I_M$ solely from fields, and not their derivatives, then it would only be necessary to
choose some arbitrary Lorentz-invariant function $\mathcal{L}({}^*\psi(x))$ of various fields
${}^*\psi_n(x)$ (but not the tetrad), call them all coordinate scalars, and take the action as

$$ I_M = \int d^4x\,\sqrt{g(x)}\;\mathcal{L}({}^*\psi(x)) $$

This would then automatically be a coordinate scalar and a Lorentz scalar.
However, the examples discussed in the previous sections of this chapter show that
any physically sensible action must involve derivatives of physical quantities as
well as the quantities themselves. The tetrad field must enter into the action in
such a way as to keep it a coordinate scalar and a Lorentz scalar despite the
presence of derivatives.

An ordinary derivative is of course a coordinate vector, in the sense that under
a coordinate transformation $x^\mu \to x'^\mu$ it transforms according to the rule

$$ \frac{\partial}{\partial x^\mu} \to \frac{\partial}{\partial x'^\mu} = \frac{\partial x^\nu}{\partial x'^\mu}\,\frac{\partial}{\partial x^\nu} $$

If all the fields appearing in the action were coordinate scalars, there would be no
contravariant indices with which to contract the covariant index $\mu$; hence, in
order to make the action a coordinate scalar, it is necessary to introduce the
tetrad field, and incorporate derivatives into the action in the form

$$ V_\alpha{}^\mu\,\frac{\partial}{\partial x^\mu} \tag{12.5.13} $$


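However, although this is a coordinate scalar, it does not have simple transformation
properties under position-dependent Lorentz transformations. Acting on a
general field ${}^*\psi$ with the Lorentz transformation rule (12.5.12), the
coordinate-scalar derivative has the transformation rule

$$
\begin{aligned}
V_\alpha{}^\mu(x)\,\frac{\partial}{\partial x^\mu}\,{}^*\psi(x) &\to \Lambda_\alpha{}^\beta(x)\,V_\beta{}^\mu(x)\,\frac{\partial}{\partial x^\mu}\left\{D(\Lambda(x))\,{}^*\psi(x)\right\} \\
&= \Lambda_\alpha{}^\beta(x)\,V_\beta{}^\mu(x)\left[D(\Lambda(x))\,\frac{\partial}{\partial x^\mu}\,{}^*\psi(x) + \left\{\frac{\partial}{\partial x^\mu}D(\Lambda(x))\right\}{}^*\psi(x)\right]
\end{aligned}
\tag{12.5.14}
$$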

However, what we need is to incorporate derivatives into the action in the
form of an operator $\mathcal{D}_\alpha$ that is not only a coordinate scalar, but also, unlike
(12.5.13), is a Lorentz vector, in the sense that for a position-dependent Lorentz
transformation $\Lambda^\alpha{}_\beta(x)$,

$$ \mathcal{D}_\alpha\,{}^*\psi(x) \to \Lambda_\alpha{}^\beta(x)\,D(\Lambda(x))\,\mathcal{D}_\beta\,{}^*\psi(x) \tag{12.5.15} $$

Any action that depends only on various fields ${}^*\psi$ and on their “derivatives”
$\mathcal{D}_\alpha{}^*\psi$ will then automatically be independent of the choice of locally inertial
frames if it is invariant under ordinary constant Lorentz transformations.
Inspection of Eq. (12.5.14) shows that we can construct a coordinate-scalar
Lorentz-vector derivative $\mathcal{D}_\alpha$ of the form

$$ \mathcal{D}_\alpha \equiv V_\alpha{}^\mu\left[\frac{\partial}{\partial x^\mu} + \Gamma_\mu\right] \tag{12.5.16} $$
where $\Gamma_\mu$ is a matrix with the Lorentz transformation rule

$$ \Gamma_\mu(x) \to D(\Lambda(x))\,\Gamma_\mu(x)\,D^{-1}(\Lambda(x)) - \left[\frac{\partial}{\partial x^\mu}D(\Lambda(x))\right]D^{-1}(\Lambda(x)) \tag{12.5.17} $$

The inhomogeneous term in (12.5.17) will then cancel the second term in (12.5.14),
giving $\mathcal{D}_\alpha$ the desired transformation property (12.5.15).

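In order to determine the structure of the matrix $\Gamma_\mu(x)$, it will be sufficient to
consider Lorentz transformations that are infinitesimally close to the identity.
Such transformations must be of the form (2.12.5), (2.12.6):

$$ \Lambda^\alpha{}_\beta(x) = \delta^\alpha{}_\beta + \omega^\alpha{}_\beta(x) \tag{12.5.18} $$

with

$$ \omega_{\alpha\beta}(x) = -\omega_{\beta\alpha}(x) \tag{12.5.19} $$

In this case, the matrix $D$ has the form (2.12.7):

$$ D(1 + \omega(x)) = 1 + \tfrac{1}{2}\,\omega^{\alpha\beta}(x)\,\sigma_{\alpha\beta} \tag{12.5.20} $$

where $\sigma_{\alpha\beta}$ are a set of constant matrices that are antisymmetric in $\alpha$ and $\beta$

$$ \sigma_{\alpha\beta} = -\sigma_{\beta\alpha} \tag{12.5.21} $$

and that satisfy the commutation relations (2.12.12):

$$ [\sigma_{\alpha\beta}, \sigma_{\gamma\delta}] = \eta_{\gamma\beta}\,\sigma_{\alpha\delta} - \eta_{\gamma\alpha}\,\sigma_{\beta\delta} + \eta_{\delta\beta}\,\sigma_{\gamma\alpha} - \eta_{\delta\alpha}\,\sigma_{\gamma\beta} \tag{12.5.22} $$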
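
As a concrete illustration of (12.5.20) through (12.5.22), one can test the simplest case, the four-vector representation. The sketch below is not taken from the text: it assumes the generators $(\sigma_{\alpha\beta})^\gamma{}_\delta = \delta^\gamma{}_\alpha\eta_{\beta\delta} - \delta^\gamma{}_\beta\eta_{\alpha\delta}$, which is the choice that makes $\tfrac12\omega^{\alpha\beta}\sigma_{\alpha\beta}$ act on a vector as $\omega^\gamma{}_\delta$, and it assumes the ordering $\eta = \mathrm{diag}(-1,+1,+1,+1)$ with index 0 as the time direction.

```python
N = 4
ETA = [[-1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]  # assumed index ordering

def delta(i, j):
    return 1 if i == j else 0

def sigma(a, b):
    """Vector-representation generator (assumed form): rows carry the upper index."""
    return [[delta(g, a) * ETA[b][d] - delta(g, b) * ETA[a][d] for d in range(N)]
            for g in range(N)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]

def commutator(A, B):
    AB, BA = matmul(A, B), matmul(B, A)
    return [[AB[i][j] - BA[i][j] for j in range(N)] for i in range(N)]

def rhs_12_5_22(a, b, c, d):
    """Right-hand side of (12.5.22) for the index values a, b, c, d."""
    s_ad, s_bd, s_ca, s_cb = sigma(a, d), sigma(b, d), sigma(c, a), sigma(c, b)
    return [[ETA[c][b] * s_ad[i][j] - ETA[c][a] * s_bd[i][j]
             + ETA[d][b] * s_ca[i][j] - ETA[d][a] * s_cb[i][j]
             for j in range(N)] for i in range(N)]

def commutation_relations_hold():
    """Check (12.5.22) for every choice of the four indices (exact integers)."""
    rng = range(N)
    return all(commutator(sigma(a, b), sigma(c, d)) == rhs_12_5_22(a, b, c, d)
               for a in rng for b in rng for c in rng for d in rng)

def generator_reproduces_omega(omega_up):
    """Check that (1/2) omega^{ab} sigma_ab equals omega^g_d = omega^{gb} eta_{bd}."""
    rng = range(N)
    for g in rng:
        for d in rng:
            twice_lhs = sum(omega_up[a][b] * sigma(a, b)[g][d] for a in rng for b in rng)
            mixed = sum(omega_up[g][b] * ETA[b][d] for b in rng)
            if twice_lhs != 2 * mixed:
                return False
    return True

# Hypothetical antisymmetric parameters omega^{alpha beta}.
SAMPLE_OMEGA = [[0, 1, -2, 3], [-1, 0, 4, -5], [2, -4, 0, 6], [-3, 5, -6, 0]]
```

Because all entries are integers the comparison is exact; a spinor representation would satisfy the same relations but needs the Dirac matrices of Section 2.12, which are not reproduced here.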

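The condition (12.5.17) tells us that under the infinitesimal Lorentz transformation
(12.5.18), the matrix $\Gamma_\mu(x)$ transforms according to

$$ \Gamma_\mu(x) \to \Gamma_\mu(x) + \tfrac{1}{2}\,\omega^{\alpha\beta}(x)\left[\sigma_{\alpha\beta}, \Gamma_\mu(x)\right] - \tfrac{1}{2}\,\sigma_{\alpha\beta}\,\frac{\partial}{\partial x^\mu}\,\omega^{\alpha\beta}(x) \tag{12.5.23} $$

Note that $V^\alpha{}_\nu(x)$ transforms into

$$ V^\alpha{}_\nu(x) \to V^\alpha{}_\nu(x) + \omega^\alpha{}_\beta(x)\,V^\beta{}_\nu(x) $$

and therefore, using (12.5.8)

$$
\begin{aligned}
V_\beta{}^\nu(x)\,V_{\alpha\nu;\mu}(x) \to{}& V_\beta{}^\nu(x)\,V_{\alpha\nu;\mu}(x) + \omega_\beta{}^\gamma(x)\,V_\gamma{}^\nu(x)\,V_{\alpha\nu;\mu}(x) \\
&+ \omega_\alpha{}^\gamma(x)\,V_\beta{}^\nu(x)\,V_{\gamma\nu;\mu}(x) + \frac{\partial}{\partial x^\mu}\,\omega_{\alpha\beta}(x)
\end{aligned}
$$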
Multiplying this with $\sigma^{\alpha\beta}$ and using the commutation rules (12.5.22), we find that
the transformation condition (12.5.23) is satisfied by the matrix

$$ \Gamma_\mu(x) = \tfrac{1}{2}\,\sigma^{\alpha\beta}\,V_\alpha{}^\nu(x)\,V_{\beta\nu;\mu}(x) \tag{12.5.24} $$

To summarize: The effects of gravitation on any physical system can be taken into
account by writing down the matter action or the field equations that hold in
special relativity, and then replacing all derivatives $\partial/\partial x^\alpha$ with the “covariant”
derivatives

$$ \mathcal{D}_\alpha \equiv V_\alpha{}^\mu\,\frac{\partial}{\partial x^\mu} + \tfrac{1}{2}\,\sigma^{\beta\gamma}\,V_\beta{}^\nu\,V_\alpha{}^\mu\,V_{\gamma\nu;\mu} \tag{12.5.25} $$

This prescription yields a matter action or field equations that are invariant under
general coordinate transformations, with $V^\alpha{}_\mu$ regarded as a four-vector and with
all other fields regarded as scalars, and that also do not depend on how we choose
locally inertial frames in defining the tetrad.

How are we to define the energy-momentum tensor in this formalism? The
variation $\delta V_\alpha{}^\mu$ in the tetrad field will produce in the matter action a change

$$ \delta I_M = \int d^4x\,\sqrt{g}\;U^\alpha{}_\mu\,\delta V_\alpha{}^\mu \tag{12.5.26} $$

where $U^\alpha{}_\mu$ is a coordinate vector and a Lorentz vector. Let us tentatively define
the energy-momentum tensor by

$$ T_{\mu\nu} \equiv V_{\alpha\mu}\,U^\alpha{}_\nu \tag{12.5.27} $$

As required, this is manifestly a coordinate tensor and a Lorentz scalar. To verify
that (12.5.27) is a suitable energy-momentum tensor, we must also check that it is
symmetric

$$ T_{\mu\nu} = T_{\nu\mu} \tag{12.5.28} $$

and that it is conserved

$$ \left(T^\nu{}_\lambda\right)_{;\nu} = 0 \tag{12.5.29} $$

The symmetry property of the energy-momentum tensor is not at all obvious
in the tetrad formalism, but must be derived from the invariance of the matter
action under the infinitesimal Lorentz transformations

$$ \Lambda^\alpha{}_\beta(x) = \delta^\alpha{}_\beta + \omega^\alpha{}_\beta(x) $$

with

$$ |\omega^\alpha{}_\beta(x)| \ll 1 $$

These transformations will produce small changes in all the dynamical variables,
but the matter action $I_M$ is supposed to be stationary with respect to variations in
each of these variables except the tetrad, which enters in $I_M$ as an external field.
Hence we only need to take account of the change (12.5.11) in the tetrad field

$$ \delta V_\alpha{}^\mu(x) = \omega_\alpha{}^\beta(x)\,V_\beta{}^\mu(x) \tag{12.5.30} $$

Using (12.5.30) in (12.5.26), we find that the invariance of the matter action under
position-dependent Lorentz transformations requires that

$$ 0 = \int d^4x\,\sqrt{g(x)}\;U^\alpha{}_\mu(x)\,V^{\beta\mu}(x)\,\omega_{\alpha\beta}(x) $$

But $\omega_{\alpha\beta}(x)$ is arbitrary, except for the antisymmetry condition (12.5.19), so the
coefficient of $\omega_{\alpha\beta}(x)$ must be symmetric:

$$ U^\alpha{}_\mu\,V^{\beta\mu} = U^\beta{}_\mu\,V^{\alpha\mu} $$

Multiplying this equation with $V_{\beta\nu}V_{\alpha\lambda}$, and using (12.5.7), we find that

$$ U^\alpha{}_\nu\,V_{\alpha\lambda} = U^\alpha{}_\lambda\,V_{\alpha\nu} \tag{12.5.31} $$

which is the same as the expected symmetry condition (12.5.28).

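To show that the energy-momentum tensor defined by Eq. (12.5.27) is conserved,
we must use the invariance of the matter action under infinitesimal
coordinate transformations:

$$ x'^\mu = x^\mu + \varepsilon^\mu(x) $$

with $|\varepsilon^\mu|$ very small. Such transformations alter the tetrad field by an infinitesimal
amount

$$ \delta V_\alpha{}^\mu(x) \equiv V'_\alpha{}^\mu(x) - V_\alpha{}^\mu(x) = V_\alpha{}^\nu(x)\,\frac{\partial \varepsilon^\mu(x)}{\partial x^\nu} - \frac{\partial V_\alpha{}^\mu(x)}{\partial x^\lambda}\,\varepsilon^\lambda(x) \tag{12.5.32} $$

[Compare Eq. (12.3.1).] Also, all other coordinate-scalar fields $\psi(x)$ change by the
amounts

$$ \delta\psi(x) \equiv \psi'(x) - \psi(x) = -\frac{\partial \psi(x)}{\partial x^\lambda}\,\varepsilon^\lambda(x) $$

but again, the matter action $I_M$ is stationary with respect to variations in these
fields, so it is only the variation in the tetrad field that matters here. Using (12.5.32)
in (12.5.26), we find that the invariance of the matter action $I_M$ under general
coordinate transformations requires that

$$ 0 = \int d^4x\,\sqrt{g}\;U^\alpha{}_\mu\left[V_\alpha{}^\nu\,\frac{\partial \varepsilon^\mu}{\partial x^\nu} - \frac{\partial V_\alpha{}^\mu}{\partial x^\lambda}\,\varepsilon^\lambda\right] $$

But $\varepsilon^\lambda(x)$ is arbitrary, so after integrating by parts we can set the coefficient of
$\varepsilon^\lambda(x)$ equal to zero, and find

$$ 0 = \frac{\partial}{\partial x^\nu}\left(\sqrt{g}\,U^\alpha{}_\lambda\,V_\alpha{}^\nu\right) + \sqrt{g}\,U^\alpha{}_\mu\,\frac{\partial V_\alpha{}^\mu}{\partial x^\lambda} $$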
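Using Eqs. (12.5.27) and (12.5.8) let us write this as

$$ 0 = \frac{\partial}{\partial x^\nu}\left(\sqrt{g}\,T^\nu{}_\lambda\right) + \sqrt{g}\,T^\mu{}_\nu\,V^\alpha{}_\mu\,\frac{\partial V_\alpha{}^\nu}{\partial x^\lambda} \tag{12.5.33} $$

According to (12.5.1), the metric tensor is related to the tetrad by

$$ g^{\mu\nu} = V_\alpha{}^\mu\,V^{\alpha\nu} $$

and therefore

$$ \frac{\partial g^{\mu\nu}}{\partial x^\lambda} = V^{\alpha\nu}\,\frac{\partial V_\alpha{}^\mu}{\partial x^\lambda} + V^{\alpha\mu}\,\frac{\partial V_\alpha{}^\nu}{\partial x^\lambda} $$

Since $T_{\mu\nu}$ is symmetric, Eq. (12.5.33) may now be written

$$ 0 = \frac{\partial}{\partial x^\nu}\left(\sqrt{g}\,T^\nu{}_\lambda\right) + \tfrac{1}{2}\sqrt{g}\;T_{\mu\nu}\,\frac{\partial g^{\mu\nu}}{\partial x^\lambda} \tag{12.5.34} $$

But Eqs. (3.3.1) and (4.7.6) give

$$ \frac{\partial g^{\mu\nu}}{\partial x^\lambda} = -g^{\mu\rho}g^{\nu\sigma}\,\frac{\partial g_{\rho\sigma}}{\partial x^\lambda} = -g^{\mu\rho}\,\Gamma^\nu_{\rho\lambda} - g^{\nu\rho}\,\Gamma^\mu_{\rho\lambda} $$

$$ \frac{\partial}{\partial x^\nu}\ln\sqrt{g} = \Gamma^\mu_{\mu\nu} $$

so (12.5.34) is the same as the usual conservation law (12.5.29).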

Our definition (12.5.27) of the energy-momentum tensor is thus completely
satisfactory. Note, however, that if the matter action were not invariant under
position-dependent Lorentz transformations, then not only would $T_{\mu\nu}$ not be
symmetric, but in consequence, it also would not be conserved.

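The total action for matter and gravitation is again of the form

$$ I = I_M + I_G $$

with $I_G$ of the form (12.4.2). A variation in the tetrad will produce in the metric a
change given by (12.5.1) as

$$
\begin{aligned}
\delta g_{\mu\nu} &= V^\alpha{}_\mu\,\delta V_{\alpha\nu} + V^\alpha{}_\nu\,\delta V_{\alpha\mu} \\
&= g_{\mu\lambda}\,V^\alpha{}_\nu\,\delta V_\alpha{}^\lambda + g_{\nu\lambda}\,V^\alpha{}_\mu\,\delta V_\alpha{}^\lambda
\end{aligned}
$$

so (12.4.3) gives the change in the gravitational action as

$$ \delta I_G = \frac{1}{8\pi G}\int \left[R^\mu{}_\lambda - \tfrac{1}{2}\,\delta^\mu{}_\lambda\,R\right] V^\alpha{}_\mu\,\delta V_\alpha{}^\lambda\,\sqrt{g}\;d^4x \tag{12.5.35} $$

The total action must be stationary with respect to variations in the tetrad, so
(12.5.26) and (12.5.35) yield the field equation

$$ \left(R^\mu{}_\lambda - \tfrac{1}{2}\,\delta^\mu{}_\lambda\,R\right)V^\alpha{}_\mu = -8\pi G\,U^\alpha{}_\lambda $$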
Contracting this with $V_{\alpha\nu}$ and using (12.5.1) and (12.5.27) then yields the familiar
Einstein field equation

$$ R_{\nu\lambda} - \tfrac{1}{2}\,g_{\nu\lambda}\,R = -8\pi G\,T_{\nu\lambda} \tag{12.5.36} $$

These equations serve to determine only $g_{\mu\nu}$, leaving the tetrad determined only
up to a Lorentz transformation (12.5.11). However, the invariance of the matter
action under such position-dependent Lorentz transformations ensures that all
tetrads associated with a given metric have the same physical effects.
“Symmetry, as wide or as narrow as you may define it, is one idea by which man
through the ages has tried to comprehend and create order, beauty, and
perfection.” Hermann Weyl, Symmetry


13 SYMMETRIC SPACES


Euclid implicitly assumed that metric relations are unaffected by translations 
or rotations. Real gravitational fields do not usually have such a high degree of 
symmetry, but they often admit some group of approximate symmetry trans- 
formations, and when they do, we can use this information to help solve the 
Einstein equations, or even to do without a solution. I shall give only a very 
brief introduction to the elaborate mathematical theory of symmetric spaces, 
with special attention to the maximally symmetric spaces that are of special 
interest in cosmology. 

The initial difficulty here is: How can we use some supposed symmetry of a 
metric space to gain information about the metric, when we need to know the 
metric before we can establish a coordinate system in which to define the 
symmetry ? In order to avoid this impasse, we shall have to learn ways of describ- 
ing symmetries in a covariant language, which does not depend on any particular 
choice of coordinate system. Once this language is established, it becomes a matter 
of mathematical manipulation to determine those properties of a metric that 
follow from its symmetries. 


1 Killing Vectors 

A metric $g_{\mu\nu}(x)$ is said to be form-invariant under a given coordinate
transformation $x \to x'$, when the transformed metric $g'_{\mu\nu}(x')$ is the same function of its
argument $x'^\mu$ as the original metric $g_{\mu\nu}(x)$ was of its argument $x^\mu$, that is,

$$ g'_{\mu\nu}(y) = g_{\mu\nu}(y) \quad \text{for all } y \tag{13.1.1} $$
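[This is different from the condition for a scalar, which is that $S'(x') = S(x)$.]
At any given point the transformed metric is given by the relation

$$ g'_{\mu\nu}(x') = \frac{\partial x^\rho}{\partial x'^\mu}\,\frac{\partial x^\sigma}{\partial x'^\nu}\,g_{\rho\sigma}(x) $$

or equivalently

$$ g_{\mu\nu}(x) = \frac{\partial x'^\rho}{\partial x^\mu}\,\frac{\partial x'^\sigma}{\partial x^\nu}\,g'_{\rho\sigma}(x') $$

When (13.1.1) is valid, we can replace $g'_{\rho\sigma}(x')$ with $g_{\rho\sigma}(x')$ and so obtain the
fundamental requirement for a form invariance of the metric:

$$ g_{\mu\nu}(x) = \frac{\partial x'^\rho}{\partial x^\mu}\,\frac{\partial x'^\sigma}{\partial x^\nu}\,g_{\rho\sigma}(x') \tag{13.1.2} $$

Any transformation $x \to x'$ that satisfies (13.1.2) is called an isometry.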

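In general, Eq. (13.1.2) is a very complicated restriction on the function
$x'^\mu(x)$. It can be greatly simplified by descending to the special case of an
infinitesimal coordinate transformation:

$$ x'^\mu = x^\mu + \varepsilon\,\xi^\mu(x) \quad \text{with } |\varepsilon| \ll 1 \tag{13.1.3} $$

To first order in $\varepsilon$, Eq. (13.1.2) now reads

$$ 0 = \frac{\partial \xi^\mu(x)}{\partial x^\rho}\,g_{\mu\sigma}(x) + \frac{\partial \xi^\nu(x)}{\partial x^\sigma}\,g_{\rho\nu}(x) + \xi^\mu(x)\,\frac{\partial g_{\rho\sigma}(x)}{\partial x^\mu} \tag{13.1.4} $$

This can be rewritten in terms of derivatives of the covariant components
$\xi_\sigma \equiv g_{\mu\sigma}\xi^\mu$:

$$
\begin{aligned}
0 &= \frac{\partial \xi_\sigma}{\partial x^\rho} + \frac{\partial \xi_\rho}{\partial x^\sigma} + \xi^\mu\left[\frac{\partial g_{\rho\sigma}}{\partial x^\mu} - \frac{\partial g_{\mu\sigma}}{\partial x^\rho} - \frac{\partial g_{\rho\mu}}{\partial x^\sigma}\right] \\
&= \frac{\partial \xi_\sigma}{\partial x^\rho} + \frac{\partial \xi_\rho}{\partial x^\sigma} - 2\,\xi_\mu\,\Gamma^\mu_{\rho\sigma}
\end{aligned}
$$

or, more compactly,

$$ 0 = \xi_{\sigma;\rho} + \xi_{\rho;\sigma} \tag{13.1.5} $$

Any four-vector field $\xi_\sigma(x)$ that satisfies Eq. (13.1.5) will be said to form a Killing
vector¹ of the metric $g_{\mu\nu}(x)$. The problem of determining all infinitesimal isometries
of a given metric is now reduced to the problem of determining all Killing vectors
of the metric. Any linear combination of Killing vectors (with constant coefficients)
is a Killing vector, so it is the space of vector fields spanned by the Killing vectors
that really determines the infinitesimal isometries of a metric.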
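
A familiar example may make the Killing condition concrete. The sketch below is illustrative and is not part of the text: it assumes the unit two-sphere with metric $d\theta^2 + \sin^2\theta\,d\varphi^2$, takes the three rotation generators in their standard form (contravariant components $\xi^\theta$, $\xi^\varphi$), and evaluates the right-hand side of (13.1.4) by central finite differences at a sample point.

```python
import math

def sphere_metric(x):
    """Assumed example metric: unit 2-sphere, coordinates x = (theta, phi)."""
    theta, _phi = x
    return [[1.0, 0.0], [0.0, math.sin(theta) ** 2]]

def partial(f, x, mu, h=1e-5):
    """Central finite-difference derivative of a scalar function f along x^mu."""
    xp, xm = list(x), list(x)
    xp[mu] += h
    xm[mu] -= h
    return (f(xp) - f(xm)) / (2 * h)

def killing_residual(xi, metric, x):
    """Largest |right-hand side of (13.1.4)| over rho, sigma at the point x.

    xi(x) returns the contravariant components xi^mu as a list.
    """
    n = len(x)
    g = metric(x)
    xi_x = xi(x)
    worst = 0.0
    for rho in range(n):
        for sigma in range(n):
            total = 0.0
            for mu in range(n):
                total += partial(lambda y: xi(y)[mu], x, rho) * g[mu][sigma]
                total += partial(lambda y: xi(y)[mu], x, sigma) * g[rho][mu]
                total += xi_x[mu] * partial(lambda y: metric(y)[rho][sigma], x, mu)
            worst = max(worst, abs(total))
    return worst

# Standard rotation generators on the sphere (xi^theta, xi^phi).
ROTATIONS = [
    lambda x: [math.sin(x[1]), math.cos(x[1]) / math.tan(x[0])],
    lambda x: [math.cos(x[1]), -math.sin(x[1]) / math.tan(x[0])],
    lambda x: [0.0, 1.0],
]

# A vector field that is not an isometry, for contrast.
NOT_KILLING = lambda x: [math.sin(x[0]), 0.0]

SAMPLE_POINT = [1.0, 0.7]
```

The residual is zero only up to finite-difference error for the three rotations, while the contrasting field fails the test by an amount of order unity; with $N = 2$ the three rotations already saturate the bound $N(N+1)/2 = 3$ derived in this section.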

The Killing condition (13.1.5) is much more restrictive than it looks, for it
allows us to determine the whole function $\xi_\mu(x)$ from given values of $\xi_\sigma(X)$ and
$\xi_{\sigma;\rho}(X)$ at some point $X$. To see this, we need only recall our formula (6.5.1) for the
commutator of two covariant derivatives,

$$ \xi_{\sigma;\rho;\mu} - \xi_{\sigma;\mu;\rho} = -R^\lambda{}_{\sigma\rho\mu}\,\xi_\lambda \tag{13.1.6} $$

and the cyclic sum rule (6.6.5) for the curvature tensor,

$$ R^\lambda{}_{\sigma\rho\mu} + R^\lambda{}_{\mu\sigma\rho} + R^\lambda{}_{\rho\mu\sigma} = 0 \tag{13.1.7} $$

By adding (13.1.6) and its two cyclic permutations, we find that any vector $\xi_\mu$
must satisfy the relation

$$ 0 = \xi_{\sigma;\rho;\mu} - \xi_{\sigma;\mu;\rho} + \xi_{\mu;\sigma;\rho} - \xi_{\mu;\rho;\sigma} + \xi_{\rho;\mu;\sigma} - \xi_{\rho;\sigma;\mu} \tag{13.1.8} $$

For a Killing vector, (13.1.5) and (13.1.8) give

$$ 0 = \xi_{\sigma;\rho;\mu} - \xi_{\sigma;\mu;\rho} - \xi_{\mu;\rho;\sigma} $$

and thus (13.1.6) becomes

$$ \xi_{\mu;\rho;\sigma} = -R^\lambda{}_{\sigma\rho\mu}\,\xi_\lambda \tag{13.1.9} $$

Hence, given $\xi_\lambda$ and $\xi_{\lambda;\nu}$ at some point $X$, we can determine the second derivatives
of $\xi_\lambda(x)$ at $X$ from Eq. (13.1.9), and we can find successively higher derivatives of
$\xi_\lambda$ at $X$ by taking derivatives of Eq. (13.1.9). All the derivatives of $\xi_\lambda$ at $X$ will
thus be determined as linear combinations of $\xi_\lambda(X)$ and $\xi_{\lambda;\nu}(X)$. The function $\xi_\lambda(x)$
can then (when it exists) be constructed as a Taylor series in $x^\lambda - X^\lambda$ within
some finite neighborhood of $X$, and will again be linear in the initial values $\xi_\lambda(X)$,
$\xi_{\lambda;\nu}(X)$. Thus any particular Killing vector $\xi_\rho^n(x)$ of the metric $g_{\mu\nu}(x)$ can be
expressed as

$$ \xi_\rho^n(x) = A_\rho{}^\lambda(x; X)\,\xi_\lambda^n(X) + B_\rho{}^{\lambda\nu}(x; X)\,\xi^n_{\lambda;\nu}(X) \tag{13.1.10} $$

where $A_\rho{}^\lambda$ and $B_\rho{}^{\lambda\nu}$ are functions that of course depend on the metric and on $X$,
but do not depend on the initial values $\xi_\lambda(X)$ and $\xi_{\lambda;\nu}(X)$, and hence are the same
for all Killing vectors. Each Killing vector $\xi_\rho(x)$ of a given metric is uniquely
specified by the values of $\xi_\rho(X)$ and $\xi_{\rho;\sigma}(X)$ at any particular point $X$.

A set of Killing vectors $\xi_\rho^n(x)$ is said to be independent if they do not satisfy
any linear relations of the form

$$ \sum_n c_n\,\xi_\rho^n(x) = 0 \tag{13.1.11} $$

with constant coefficients $c_n$. Equation (13.1.10) tells us that there can be at most
$N(N + 1)/2$ independent Killing vectors in $N$ dimensions. For consider any $M$
Killing vectors $\xi_\rho^n(x)$. For each $n$, there are $N$ quantities $\xi_\rho^n(X)$ and $N(N - 1)/2$
independent quantities $\xi^n_{\rho;\nu}(X)$ [recall Eq. (13.1.5)], so we can think of the
quantities $\xi_\rho^n(X)$ and $\xi^n_{\rho;\nu}(X)$ as the components of $M$ vectors in an $N(N + 1)/2$
dimensional space. If $M > N(N + 1)/2$, then these $M$ vectors cannot be linearly
independent, so they must satisfy relations of the form

$$ \sum_n c_n\,\xi_\rho^n(X) = \sum_n c_n\,\xi^n_{\rho;\nu}(X) = 0 $$

Equation (13.1.10) then tells us that the Killing vectors $\xi_\rho^n(x)$ satisfy the relations
(13.1.11) everywhere, and are therefore not independent Killing vectors.

This result is significant only because we defined independent Killing vectors
as vectors that are not subject to any linear relations with constant coefficients.
At some given point $X$ in an $N$-dimensional space, any set of more than $N$ Killing
vectors will of course be subject to one or more linear relations such as (13.1.11).
However, the coefficients $c_n$ in these linear relations need not be constant in $x$.
The above theorem says that any set of more than $N(N + 1)/2$ Killing vectors
will be subject to linear relations with constant coefficients.

A metric space is said to be homogeneous if there exist infinitesimal isometries
(13.1.3) that carry any given point $X$ into any other point in its immediate
neighborhood. That is, the metric must admit Killing vectors that at any given
point take all possible values. In particular, in an $N$-dimensional space we can
choose a set of $N$ Killing vectors $\xi_\lambda^{(\mu)}(x; X)$ with

$$ \xi_\lambda^{(\mu)}(X; X) = \delta_\lambda{}^\mu $$

These are evidently independent, because any relation of the form $c_\mu\,\xi_\lambda^{(\mu)}(x; X) = 0$
would at $x = X$ imply that all $c_\lambda$ vanish.

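A metric space is said to be isotropic about a given point $X$ if there exist
infinitesimal isometries (13.1.3) that leave the point $X$ fixed, so that $\xi^\lambda(X) = 0$,
and for which the first derivatives $\xi_{\lambda;\nu}(X)$ take all possible values, subject only to
the antisymmetry condition (13.1.5). In particular, in $N$ dimensions we can choose
a set of $N(N - 1)/2$ Killing vectors $\xi_\lambda^{(\mu\nu)}(x; X)$ with

$$
\begin{aligned}
\xi_\lambda^{(\mu\nu)}(x; X) &\equiv -\xi_\lambda^{(\nu\mu)}(x; X) \\
\xi_\lambda^{(\mu\nu)}(X; X) &\equiv 0 \\
\xi_{\lambda;\rho}^{(\mu\nu)}(X; X) &\equiv \left[\frac{\partial}{\partial x^\rho}\,\xi_\lambda^{(\mu\nu)}(x; X)\right]_{x=X} \equiv \delta_\lambda{}^\mu\,\delta_\rho{}^\nu - \delta_\rho{}^\mu\,\delta_\lambda{}^\nu
\end{aligned}
$$

These are independent, because any relation of the form $c_{\mu\nu}\,\xi_\lambda^{(\mu\nu)}(x; X) = 0$ with
$c_{\mu\nu} = -c_{\nu\mu}$ would, on differentiating with respect to $x^\rho$ and setting $x = X$, imply
that $c_{\lambda\rho} - c_{\rho\lambda} = 2c_{\lambda\rho} = 0$.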

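We shall also have to deal with spaces that are isotropic about every point.
In this case there are Killing vectors $\xi_\lambda^{(\mu\nu)}(x; X)$ and $\xi_\lambda^{(\mu\nu)}(x; X + dX)$ that satisfy
the above initial conditions at $X$ and at $X + dX$, respectively. Any linear combination
of these will be a Killing vector, and so $\partial \xi_\lambda^{(\mu\nu)}(x; X)/\partial X^\rho$ will also be a
Killing vector of the metric. In order to evaluate this Killing vector at $x = X$
we need only recall that $\xi_\lambda^{(\mu\nu)}(X; X)$ vanishes, and therefore

$$ \frac{\partial}{\partial X^\rho}\,\xi_\lambda^{(\mu\nu)}(X; X) = \left[\frac{\partial \xi_\lambda^{(\mu\nu)}(x; X)}{\partial x^\rho}\right]_{x=X} + \left[\frac{\partial \xi_\lambda^{(\mu\nu)}(x; X)}{\partial X^\rho}\right]_{x=X} = 0 $$

This gives

$$ \left[\frac{\partial}{\partial X^\rho}\,\xi_\lambda^{(\mu\nu)}(x; X)\right]_{x=X} = -\delta_\lambda{}^\mu\,\delta_\rho{}^\nu + \delta_\rho{}^\mu\,\delta_\lambda{}^\nu $$

It is now obvious that we can construct a Killing vector $\xi_\lambda(x)$ that takes any
arbitrary value $a_\lambda$ at $x = X$; we need only take

$$ \xi_\lambda(x) = \frac{a_\nu}{N - 1}\,\frac{\partial}{\partial X^\mu}\,\xi_\lambda^{(\mu\nu)}(x; X) $$

Hence any space that is isotropic about every point is also homogeneous.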
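
The construction just given can be tried out in the one case where the Killing vectors are known in closed form. The sketch below is a hedged illustration, assuming Euclidean flat space in Cartesian coordinates (so that upper and lower indices coincide) and the explicit rotation generators $\xi_\lambda^{(\mu\nu)}(x; X) = \delta_\lambda{}^\mu (x^\nu - X^\nu) - \delta_\lambda{}^\nu (x^\mu - X^\mu)$, which satisfy the normalization conditions stated above; the function names are hypothetical.

```python
def delta(i, j):
    return 1.0 if i == j else 0.0

def xi_rotation(mu, nu, lam, x, X):
    """Component lam of the rotation Killing vector (mu nu) about X in flat space."""
    return delta(lam, mu) * (x[nu] - X[nu]) - delta(lam, nu) * (x[mu] - X[mu])

def d_by_dX(mu, nu, lam, x, X, rho, h=1e-3):
    """Central difference of xi^(mu nu)_lam(x; X) with respect to X^rho."""
    Xp, Xm = list(X), list(X)
    Xp[rho] += h
    Xm[rho] -= h
    return (xi_rotation(mu, nu, lam, x, Xp) - xi_rotation(mu, nu, lam, x, Xm)) / (2 * h)

def translation_from_rotations(a, x, X):
    """xi_lam(x) = a_nu / (N - 1) * d/dX^mu xi^(mu nu)_lam(x; X), summed on mu, nu."""
    n = len(a)
    return [sum(a[nu] * d_by_dX(mu, nu, lam, x, X, mu)
                for mu in range(n) for nu in range(n)) / (n - 1)
            for lam in range(n)]

# Hypothetical sample data in three dimensions.
SAMPLE_A = [1.0, -2.0, 0.5]
SAMPLE_X0 = [0.3, 0.1, -0.7]      # the point X about which the space is isotropic
SAMPLE_x = [2.0, -1.0, 4.0]       # an arbitrary field point
```

In this flat example the resulting Killing vector is the constant field $a_\lambda$ at every `x`, that is, a translation, which is exactly the homogeneity asserted in the text.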

A metric that admits the maximum number $N(N + 1)/2$ of Killing vectors is
said to be maximally symmetric. In particular, a space that is both homogeneous
and isotropic about some given point $X$ will admit the $N(N + 1)/2$ Killing vectors
$\xi_\lambda^{(\mu)}(x; X)$ and $\xi_\lambda^{(\mu\nu)}(x; X)$. These Killing vectors are obviously independent, for if
they satisfy a linear relation

$$ 0 = c_\mu\,\xi_\lambda^{(\mu)}(x; X) + c_{\mu\nu}\,\xi_\lambda^{(\mu\nu)}(x; X) \qquad c_{\mu\nu} = -c_{\nu\mu} $$

then setting $x = X$ gives $c_\lambda = 0$, and differentiating with respect to $x^\rho$ before
setting $x = X$ then gives $c_{\lambda\rho} = 0$. Thus a homogeneous space that is isotropic about
some point is maximally symmetric. It then also follows that any space that is
isotropic about every point is maximally symmetric.

We can also prove the converse, that a maximally symmetric space is necessarily
homogeneous and isotropic about all points. If there are $N(N + 1)/2$ independent
Killing vectors $\xi_\lambda^n(x)$, then we can think of the quantities $\xi_\mu^n(X)$, $\xi^n_{\lambda;\nu}(X)$ as
forming a square matrix, with $N(N + 1)/2$ rows labeled by $n$, and $N(N + 1)/2$
columns labeled by the $N$ values of $\mu$ and the $N(N - 1)/2$ values of $\lambda$ and $\nu$ with
$\lambda > \nu$. Furthermore, this matrix must have a nonvanishing determinant, because
any relation of the form

$$ \sum_n c_n\,\xi_\mu^n(X) = \sum_n c_n\,\xi^n_{\lambda;\nu}(X) = 0 $$

would with (13.1.10) imply that $\sum_n c_n\,\xi_\mu^n(x)$ vanishes, contrary to our assumption
that these Killing vectors are independent. It must therefore be possible, for any
“row vector” with “components” $a_\mu$ and $b_{\mu\nu} = -b_{\nu\mu}$, to find a solution of the
equations

$$ \sum_n d_n\,\xi_\mu^n(X) = a_\mu $$

$$ \sum_n d_n\,\xi^n_{\mu;\nu}(X) = b_{\mu\nu} $$

Hence we can find a Killing vector $\xi_\mu(x)$ for which $\xi_\mu(X)$ takes the value $a_\mu$ and
$\xi_{\mu;\nu}(X)$ takes the value $b_{\mu\nu}$, by choosing

$$ \xi_\mu(x) = \sum_n d_n\,\xi_\mu^n(x) $$

But $a_\mu$ is arbitrary, so the space is homogeneous, and $b_{\mu\nu}$ is arbitrary (except that
$b_{\mu\nu} = -b_{\nu\mu}$), so the space is isotropic about $X$.

As an example of a maximally symmetric space, consider an $N$-dimensional
flat space, with vanishing curvature tensor. We can then choose Cartesian
coordinates with a constant metric and vanishing affine connection. In this coordinate
system, Eq. (13.1.9) reads

$$ \frac{\partial^2 \xi_\mu}{\partial x^\rho\,\partial x^\sigma} = 0 $$

The solution is

$$ \xi_\mu(x) = a_\mu + b_{\mu\nu}\,x^\nu $$

with $a_\mu$ and $b_{\mu\nu}$ constant. This satisfies the Killing vector condition (13.1.5) if and
only if

$$ b_{\mu\nu} = -b_{\nu\mu} $$

We can thus choose a set of $N(N + 1)/2$ Killing vectors as follows:

$$ \xi_\mu^{(\nu)}(x) \equiv \delta_\mu{}^\nu $$

$$ \xi_\mu^{(\nu\lambda)}(x) \equiv \delta_\mu{}^\nu\,x^\lambda - \delta_\mu{}^\lambda\,x^\nu $$

and the general Killing vector is

$$ \xi_\mu(x) = a_\nu\,\xi_\mu^{(\nu)}(x) + b_{\nu\lambda}\,\xi_\mu^{(\nu\lambda)}(x) $$

The $N$ vectors $\xi_\mu^{(\nu)}(x)$ represent translations, whereas the $N(N - 1)/2$ vectors
$\xi_\mu^{(\nu\lambda)}(x)$ represent infinitesimal rotations (or, for a Minkowski space, Lorentz
transformations). Thus any flat metric admits $N(N + 1)/2$ independent Killing vectors,
and is therefore maximally symmetric.
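
The counting in this flat-space example can be reproduced mechanically. The following sketch is an illustration under the assumption of Cartesian coordinates with a unit metric; it builds the translations and rotations listed above, checks the Killing condition (13.1.5), and tests independence by computing the rank of the initial data $\xi_\mu(0)$, $\xi_{\mu;\rho}(0)$ with exact rational arithmetic. The function names are hypothetical.

```python
from fractions import Fraction

def flat_killing_vectors(n):
    """Return (value_at_origin, gradient) pairs for Cartesian flat space.

    gradient[rho][mu] = d xi_mu / d x^rho, constant because each vector is
    at most linear in x.
    """
    vectors = []
    for nu in range(n):                      # translations: xi_mu = delta_mu^nu
        value = [1 if mu == nu else 0 for mu in range(n)]
        grad = [[0] * n for _ in range(n)]
        vectors.append((value, grad))
    for nu in range(n):                      # rotations: delta_mu^nu x^lam - delta_mu^lam x^nu
        for lam in range(nu + 1, n):
            value = [0] * n
            grad = [[0] * n for _ in range(n)]
            grad[lam][nu] += 1               # d xi_nu / d x^lam
            grad[nu][lam] -= 1               # d xi_lam / d x^nu
            vectors.append((value, grad))
    return vectors

def is_killing(grad):
    """Flat-space form of (13.1.5): d_rho xi_mu + d_mu xi_rho = 0."""
    n = len(grad)
    return all(grad[r][m] + grad[m][r] == 0 for r in range(n) for m in range(n))

def rank(rows):
    """Rank of a list of rows by Gaussian elimination over the rationals."""
    rows = [[Fraction(v) for v in row] for row in rows]
    rk = 0
    for col in range(len(rows[0]) if rows else 0):
        pivot = next((i for i in range(rk, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for i in range(len(rows)):
            if i != rk and rows[i][col] != 0:
                factor = rows[i][col] / rows[rk][col]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[rk])]
        rk += 1
    return rk

def count_independent(n):
    """Number of independent Killing vectors found from the initial data at x = 0."""
    data = []
    for value, grad in flat_killing_vectors(n):
        upper = [grad[r][m] for r in range(n) for m in range(r + 1, n)]
        data.append(value + upper)
    return rank(data)
```

For $N = 2, 3, 4$ the count comes out as 3, 6, and 10, matching $N(N+1)/2$, and the use of the initial data at a single point mirrors the argument based on (13.1.10).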

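Of course, not all metrics admit the maximum number of Killing vectors.
Whether (13.1.9) is soluble for a given set of initial data $\xi_\lambda(X)$, $\xi_{\lambda;\nu}(X)$ depends
on the integrability of this equation, which in turn depends on the metric. One
integrability condition we shall use below follows from the general formula for
commutators of covariant derivatives of tensors:

$$ \xi_{\rho;\mu;\sigma;\nu} - \xi_{\rho;\mu;\nu;\sigma} = -R^\lambda{}_{\rho\sigma\nu}\,\xi_{\lambda;\mu} - R^\lambda{}_{\mu\sigma\nu}\,\xi_{\rho;\lambda} $$

Equation (13.1.9) will satisfy this condition if and only if

$$ R^\lambda{}_{\sigma\rho\mu}\,\xi_{\lambda;\nu} - R^\lambda{}_{\nu\rho\mu}\,\xi_{\lambda;\sigma} + \left(R^\lambda{}_{\sigma\rho\mu;\nu} - R^\lambda{}_{\nu\rho\mu;\sigma}\right)\xi_\lambda = -R^\lambda{}_{\rho\sigma\nu}\,\xi_{\lambda;\mu} - R^\lambda{}_{\mu\sigma\nu}\,\xi_{\rho;\lambda} $$

or, using (13.1.5),

$$ \left[-R^\lambda{}_{\rho\sigma\nu}\,\delta^\kappa{}_\mu + R^\lambda{}_{\mu\sigma\nu}\,\delta^\kappa{}_\rho - R^\lambda{}_{\sigma\rho\mu}\,\delta^\kappa{}_\nu + R^\lambda{}_{\nu\rho\mu}\,\delta^\kappa{}_\sigma\right]\xi_{\lambda;\kappa} = \left[R^\lambda{}_{\sigma\rho\mu;\nu} - R^\lambda{}_{\nu\rho\mu;\sigma}\right]\xi_\lambda \tag{13.1.12} $$

These conditions are of course empty for a flat space, but in general they will
impose linear relations among the $\xi_\lambda$ and $\xi_{\lambda;\kappa}$ at any given point. Alternatively, if
we know something about the Killing vectors admitted by an unknown metric,
then we can use (13.1.12) to learn something about its curvature tensor. In this
way, we shall be able in the following sections to deduce the form of a maximally
symmetric metric from its isometries.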

It should be emphasized that the existence of a definite number of independent
Killing vectors does not depend on a particular choice of coordinate system. If
$\xi^\mu(x)$ is a Killing vector of a metric $g_{\mu\nu}(x)$, then by performing a coordinate
transformation $x^\mu \to x'^\mu$ we obtain a metric

$$ g'_{\mu\nu}(x') = \frac{\partial x^\rho}{\partial x'^\mu}\,\frac{\partial x^\sigma}{\partial x'^\nu}\,g_{\rho\sigma}(x) $$

and, since (13.1.5) is generally covariant, this obviously has a Killing vector

$$ \xi'^\mu(x') = \frac{\partial x'^\mu}{\partial x^\nu}\,\xi^\nu(x) $$

If $M$ Killing vectors $\xi_n^\mu(x)$ are independent, then so are the $M$ Killing vectors
$\xi_n'^\mu(x')$, for any linear relation among the $\xi'_n$ would imply a linear relation among
the $\xi_n$. Thus the maximal symmetry of a given space is an inner property, not
depending on how we choose the coordinate system. In particular, it follows that
any space with vanishing curvature tensor is maximally symmetric; the converse,
however, is not true. It is also easy to see that the homogeneity or isotropy of a
given space is independent of the choice of coordinates. As far as these simple
symmetries are concerned, we have accomplished the task laid out in the
introduction to this chapter, that of describing symmetries of the metric in a generally
covariant language.


2 Maximally Symmetric Spaces: Uniqueness 

We now show that the maximally symmetric spaces are uniquely specified by 
a ‘‘curvature constant” K, and by the numbers of eigenvalues of the metric that 
are positive or negative. That is, given two maximally symmetric metrics with the same $K$ and the same numbers of eigenvalues of each sign, it will always be possible to find a coordinate transformation that carries one metric into the other. Armed with this theorem, we shall be able in the next section to carry out an exhaustive study of maximally symmetric spaces by simply constructing such metrics in one convenient coordinate system.

We showed in the last section that at any given point $x$ in a maximally symmetric space, we can find Killing vectors for which $\xi_\lambda(x)$ vanishes and for which $\xi_{\lambda;\kappa}(x)$ is an arbitrary antisymmetric matrix. It follows then that the coefficient of $\xi_{\lambda;\kappa}(x)$ in Eq. (13.1.12) must have a vanishing antisymmetric part, that is,

$$-R^\lambda{}_{\rho\sigma\nu}\delta^\kappa_\mu + R^\lambda{}_{\mu\sigma\nu}\delta^\kappa_\rho - R^\lambda{}_{\sigma\rho\mu}\delta^\kappa_\nu + R^\lambda{}_{\nu\rho\mu}\delta^\kappa_\sigma = -R^\kappa{}_{\rho\sigma\nu}\delta^\lambda_\mu + R^\kappa{}_{\mu\sigma\nu}\delta^\lambda_\rho - R^\kappa{}_{\sigma\rho\mu}\delta^\lambda_\nu + R^\kappa{}_{\nu\rho\mu}\delta^\lambda_\sigma \tag{13.2.1}$$

We also showed that at any given point x in a maximally symmetric space, there 
exist Killing vectors for which $\xi_\lambda(x)$ takes any values we like, so (13.1.12) and (13.2.1) require that

$$R^\lambda{}_{\sigma\rho\mu;\nu} = R^\lambda{}_{\nu\rho\mu;\sigma} \tag{13.2.2}$$

We actually only need to use (13.2.1), because we have shown in the last section 
that a space that is isotropic about every point, and hence satisfies (13.2.1), must 
also be homogeneous, and hence must also satisfy (13.2.2). 

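Our first step in the proof is to use Eq. (13.2.1) to derive a formula for the curvature tensor. Contracting $\kappa$ with $\mu$ yields

$$-N R^\lambda{}_{\rho\sigma\nu} + R^\lambda{}_{\rho\sigma\nu} - R^\lambda{}_{\sigma\rho\nu} + R^\lambda{}_{\nu\rho\sigma} = -R^\lambda{}_{\rho\sigma\nu} + R_{\sigma\rho}\,\delta^\lambda_\nu - R_{\nu\rho}\,\delta^\lambda_\sigma$$

(Recall that $R^\kappa{}_{\kappa\sigma\nu}$ vanishes, $-R^\kappa{}_{\sigma\rho\kappa}$ is the Ricci tensor $R_{\sigma\rho}$, and in $N$ dimensions, $\delta^\kappa_\kappa = N$.) Using the cyclic sum rule (6.6.5) and the antisymmetry of $R^\lambda{}_{\rho\sigma\nu}$ in $\sigma$ and $\nu$, we find

$$(N-1)\,R_{\lambda\rho\sigma\nu} = R_{\nu\rho}\,g_{\lambda\sigma} - R_{\sigma\rho}\,g_{\lambda\nu} \tag{13.2.3}$$

But this must be antisymmetric in $\lambda$ and $\rho$, so

$$R_{\nu\rho}\,g_{\lambda\sigma} - R_{\sigma\rho}\,g_{\lambda\nu} = -R_{\nu\lambda}\,g_{\rho\sigma} + R_{\sigma\lambda}\,g_{\rho\nu}$$

Contracting $\lambda$ with $\nu$, we find

$$R_{\sigma\rho} - N R_{\sigma\rho} = -R^\lambda{}_\lambda\,g_{\sigma\rho} + R_{\rho\sigma}$$

The Ricci tensor thus takes the form

$$R_{\sigma\rho} = \frac{1}{N}\,g_{\sigma\rho}\,R^\lambda{}_\lambda \tag{13.2.4}$$

Inserting this in (13.2.3) gives our formula for the curvature tensor

$$R_{\lambda\rho\sigma\nu} = \frac{R^\kappa{}_\kappa}{N(N-1)}\left(g_{\nu\rho}\,g_{\lambda\sigma} - g_{\sigma\rho}\,g_{\lambda\nu}\right) \tag{13.2.5}$$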

This formula satisfies (13.2.1), so there is nothing further to be learned from that 
condition. 

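In a space that is isotropic about every point, Eqs. (13.2.4) and (13.2.5) will hold everywhere, and we can use the Bianchi identities to say something about the dependence of the curvature scalar $R^\lambda{}_\lambda$ on position. Using (13.2.4) in (6.8.4), we have

$$0 = \left[R^\rho{}_\sigma - \tfrac{1}{2}\,\delta^\rho_\sigma\,R^\lambda{}_\lambda\right]_{;\rho} = \left(\frac{1}{N} - \frac{1}{2}\right)\left[R^\lambda{}_\lambda\right]_{;\sigma}$$

or

$$0 = \left(\frac{1}{N} - \frac{1}{2}\right)\frac{\partial}{\partial x^\sigma}\,R^\lambda{}_\lambda \tag{13.2.6}$$

Hence any space of three or more dimensions, in which (13.2.4) holds everywhere, will have $R^\lambda{}_\lambda$ constant. It is convenient to introduce a curvature constant $K$ in place of $R^\lambda{}_\lambda$, with

$$R^\lambda{}_\lambda = -N(N-1)\,K \tag{13.2.7}$$

Using this in (13.2.4) and (13.2.5) gives the Ricci tensor and the Riemann-Christoffel tensor here as

$$R_{\sigma\rho} = -(N-1)\,K\,g_{\sigma\rho} \tag{13.2.8}$$

$$R_{\lambda\rho\sigma\nu} = K\left(g_{\sigma\rho}\,g_{\lambda\nu} - g_{\nu\rho}\,g_{\lambda\sigma}\right) \tag{13.2.9}$$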

In differential geometry a space with these properties is called a space of constant 
curvature. 
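
The contractions leading from (13.2.9) back to (13.2.8) and (13.2.7) can be checked mechanically. The sketch below assumes a constant diagonal metric with entries $\pm 1$ (so that the metric is its own inverse) and uses the convention stated above that $-R^\kappa{}_{\sigma\rho\kappa}$ is the Ricci tensor; the function names are hypothetical helpers, not notation from the text.

```python
from itertools import product


def riemann_constant_curvature(g, K):
    """R_{l r s v} = K (g_{s r} g_{l v} - g_{v r} g_{l s}), as in Eq. (13.2.9)."""
    n = len(g)
    return {(l, r, s, v): K * (g[s][r] * g[l][v] - g[v][r] * g[l][s])
            for l, r, s, v in product(range(n), repeat=4)}


def ricci_tensor(riem, g_inv):
    """R_{m k} = g^{l v} R_{v m l k}, equivalent to R_{m k} = -R^l_{m k l}."""
    n = len(g_inv)
    return [[sum(g_inv[l][v] * riem[(v, m, l, k)]
                 for l in range(n) for v in range(n))
             for k in range(n)] for m in range(n)]


def curvature_scalar(ric, g_inv):
    """Full contraction g^{m k} R_{m k}."""
    n = len(g_inv)
    return sum(g_inv[m][k] * ric[m][k] for m in range(n) for k in range(n))


def diagonal_metric(signs):
    """Constant metric diag(signs); with entries +1 or -1 it equals its inverse."""
    n = len(signs)
    return [[signs[i] if i == j else 0 for j in range(n)] for i in range(n)]


if __name__ == "__main__":
    K = 2
    g = diagonal_metric((1, 1, 1, -1))
    riem = riemann_constant_curvature(g, K)
    ric = ricci_tensor(riem, g)
    print(ric, curvature_scalar(ric, g))
```

Only the algebraic index structure is being tested here; a constant metric is of course flat, so the example says nothing about whether such a $K$ can actually occur for that metric.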

Incidentally, we showed in Section 6.7 that the curvature tensor in two dimensions is always of the form (13.2.5), so it is not surprising that in this case (13.2.6) does not allow us to draw any conclusions about the constancy of $R^\lambda{}_\lambda$. However, by using (13.2.2) one can show that the quantity $K$ in (13.2.9) is also constant for maximally symmetric spaces of dimensionality $N = 2$.

Now suppose that we are given two metrics $g_{\mu\nu}(x)$ and $g'_{\mu\nu}(x')$, both having the same numbers of positive and negative eigenvalues, and both satisfying the condition (13.2.9) for a maximally symmetric space, that is,

$$R_{\lambda\rho\sigma\nu} = K\left(g_{\sigma\rho}\,g_{\lambda\nu} - g_{\nu\rho}\,g_{\lambda\sigma}\right) \tag{13.2.10}$$

$$R'_{\lambda\rho\sigma\nu} = K\left(g'_{\sigma\rho}\,g'_{\lambda\nu} - g'_{\nu\rho}\,g'_{\lambda\sigma}\right) \tag{13.2.11}$$

with the same curvature constant $K$. We shall show that $g_{\mu\nu}(x)$ and $g'_{\mu\nu}(x')$ must be equivalent, in the sense that there is a transformation $x \to x'$ that converts $g_{\mu\nu}(x)$ into $g'_{\mu\nu}(x')$, that is, for which

$$g_{\mu\nu}(x) = g'_{\rho\sigma}(x')\,\frac{\partial x'^\rho}{\partial x^\mu}\frac{\partial x'^\sigma}{\partial x^\nu} \tag{13.2.12}$$

We shall prove this by actually constructing $x'^\mu(x)$ as a power series in $x^\mu$. First, note that the equality in the numbers of positive and negative eigenvalues of $g_{\mu\nu}$ and $g'_{\mu\nu}$ means that we can find a nonsingular matrix $d^\rho{}_\mu$ for which

$$g'_{\rho\sigma}(0)\,d^\rho{}_\mu\,d^\sigma{}_\nu = g_{\mu\nu}(0) \tag{13.2.13}$$

(The argument here is the same as in Section 6.4.) Thus we can satisfy (13.2.12) to zero order in $x$ with

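$$x'^\mu = d^\mu{}_\rho\,x^\rho$$

Now we proceed by mathematical induction. Suppose that we succeed in satisfying (13.2.12) to order $n-1$ in $x^\mu$ with a polynomial

$$x'^\mu(x) = d^\mu{}_\rho\,x^\rho + \sum_{m=2}^{n}\frac{1}{m!}\,d^\mu{}_{\rho_1\cdots\rho_m}\,x^{\rho_1}\cdots x^{\rho_m} \tag{13.2.14}$$

We want to add a term of order $n+1$ in $x^\mu$ so that (13.2.12) holds to order $n$. This condition will be satisfied if the derivative of (13.2.12) holds in order $n-1$, that is, if

$$\frac{\partial^2 x'^\mu}{\partial x^\kappa\,\partial x^\rho}\frac{\partial x'^\nu}{\partial x^\lambda}\,g'_{\mu\nu}(x') + \frac{\partial x'^\mu}{\partial x^\rho}\frac{\partial^2 x'^\nu}{\partial x^\kappa\,\partial x^\lambda}\,g'_{\mu\nu}(x') + \frac{\partial x'^\mu}{\partial x^\rho}\frac{\partial x'^\nu}{\partial x^\lambda}\frac{\partial x'^\sigma}{\partial x^\kappa}\frac{\partial g'_{\mu\nu}(x')}{\partial x'^\sigma} = \frac{\partial g_{\rho\lambda}(x)}{\partial x^\kappa}$$

This will be satisfied if (and, in fact, only if)

$$g'_{\mu\nu}(x')\,\frac{\partial^2 x'^\mu}{\partial x^\rho\,\partial x^\lambda}\frac{\partial x'^\nu}{\partial x^\kappa} = g_{\sigma\kappa}(x)\,\Gamma^\sigma_{\rho\lambda}(x) - \frac{\partial x'^\mu}{\partial x^\rho}\frac{\partial x'^\nu}{\partial x^\lambda}\frac{\partial x'^\sigma}{\partial x^\kappa}\,g'_{\sigma\tau}(x')\,\Gamma'^\tau_{\mu\nu}(x') \quad \text{in order } x^{n-1}$$

This only needs to hold in order $n-1$ in $x^\mu$, so we can use (13.2.12), which was assumed to hold to this order, to convert it into an equivalent requirement

$$\frac{\partial^2 x'^\mu}{\partial x^\rho\,\partial x^\lambda} = \frac{\partial x'^\mu}{\partial x^\kappa}\,\Gamma^\kappa_{\lambda\rho}(x) - \frac{\partial x'^\nu}{\partial x^\rho}\frac{\partial x'^\kappa}{\partial x^\lambda}\,\Gamma'^\mu_{\nu\kappa}(x') \quad \text{in order } x^{n-1} \tag{13.2.15}$$

We can use (13.2.14), which is correct to order $x^n$, to calculate the term on the right-hand side of order $x^{n-1}$. Let us write the result as

$$\left[\frac{\partial x'^\mu}{\partial x^\kappa}\,\Gamma^\kappa_{\lambda\rho}(x) - \frac{\partial x'^\nu}{\partial x^\rho}\frac{\partial x'^\kappa}{\partial x^\lambda}\,\Gamma'^\mu_{\nu\kappa}(x')\right]_{\text{order } n-1} = \frac{1}{(n-1)!}\,c^\mu{}_{\lambda\rho\sigma_1\cdots\sigma_{n-1}}\,x^{\sigma_1}\cdots x^{\sigma_{n-1}} \tag{13.2.16}$$

the coefficients $c^\mu{}_{\lambda\rho\sigma_1\cdots\sigma_{n-1}}$ depending in a complicated way on the functions $g_{\mu\nu}(x)$ and $g'_{\mu\nu}(x')$ and on the previously determined coefficients $d^\mu{}_{\rho_1\cdots\rho_m}$. Then (13.2.15) will be satisfied in order $n-1$ if we add to (13.2.14) a term

$$\left[x'^\mu(x)\right]_{\text{order } n+1} = \frac{1}{(n+1)!}\,c^\mu{}_{\lambda\rho\sigma_1\cdots\sigma_{n-1}}\,x^\lambda x^\rho x^{\sigma_1}\cdots x^{\sigma_{n-1}} \tag{13.2.17}$$

provided that the coefficient $c^\mu{}_{\lambda\rho\sigma_1\cdots\sigma_{n-1}}$ is totally symmetric in all its lower indices. These coefficients are obviously symmetric under interchange of $\lambda$ and $\rho$ or among the $\sigma_m$ indices, so the only condition that needs to be satisfied is that they are symmetric between $\lambda$ and any $\sigma_m$, or equivalently, that the derivative of (13.2.16) with respect to $x^\sigma$ should be symmetric between $\lambda$ and $\sigma$:

$$\frac{\partial}{\partial x^\sigma}\left[\frac{\partial x'^\mu}{\partial x^\kappa}\,\Gamma^\kappa_{\lambda\rho}(x) - \frac{\partial x'^\nu}{\partial x^\rho}\frac{\partial x'^\kappa}{\partial x^\lambda}\,\Gamma'^\mu_{\nu\kappa}(x')\right] = \frac{\partial}{\partial x^\lambda}\left[\frac{\partial x'^\mu}{\partial x^\kappa}\,\Gamma^\kappa_{\sigma\rho}(x) - \frac{\partial x'^\nu}{\partial x^\rho}\frac{\partial x'^\kappa}{\partial x^\sigma}\,\Gamma'^\mu_{\nu\kappa}(x')\right] \quad \text{in order } x^{n-2} \tag{13.2.18}$$

Since (13.2.12) is assumed to hold to order $x^{n-1}$, its derivative, Eq. (13.2.15), will hold to order $x^{n-2}$, so we can use (13.2.12) and (13.2.15) to rewrite (13.2.18) as the equivalent requirement

$$\frac{\partial x'^\mu}{\partial x^\kappa}\,R^\kappa{}_{\rho\lambda\sigma}(x) = \frac{\partial x'^\nu}{\partial x^\rho}\frac{\partial x'^\kappa}{\partial x^\lambda}\frac{\partial x'^\eta}{\partial x^\sigma}\,R'^\mu{}_{\nu\kappa\eta}(x') \quad \text{in order } x^{n-2} \tag{13.2.19}$$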

Now for the first time we use Eqs. (13.2.10) and (13.2.11), which allow (13.2.19) to be replaced with the equivalent requirement

$$K\left[\frac{\partial x'^\mu}{\partial x^\sigma}\,g_{\lambda\rho}(x) - \frac{\partial x'^\mu}{\partial x^\lambda}\,g_{\sigma\rho}(x)\right] = K\left[\frac{\partial x'^\mu}{\partial x^\sigma}\frac{\partial x'^\nu}{\partial x^\rho}\frac{\partial x'^\kappa}{\partial x^\lambda}\,g'_{\nu\kappa}(x') - \frac{\partial x'^\mu}{\partial x^\lambda}\frac{\partial x'^\nu}{\partial x^\rho}\frac{\partial x'^\kappa}{\partial x^\sigma}\,g'_{\nu\kappa}(x')\right] \quad \text{in order } x^{n-2} \tag{13.2.20}$$

This condition is satisfied, because (13.2.12) was assumed to hold to order $x^{n-1}$. To recapitulate, this implies that (13.2.19) holds in order $x^{n-2}$, which implies that (13.2.18) holds in order $x^{n-2}$, which implies that the coefficients $c^\mu{}_{\lambda\rho\sigma_1\cdots\sigma_{n-1}}$ are totally symmetric in their lower indices, which implies that (13.2.17) satisfies (13.2.15), which implies that by adding (13.2.17) to (13.2.14) we can satisfy (13.2.12) to order $x^n$. Thus, if (13.2.12) can be satisfied to order $x^{n-1}$ by a polynomial $x'(x)$ of order $n$, it can be satisfied to order $x^n$ by a polynomial $x'(x)$ of order $n+1$, and therefore a function $x'(x)$ satisfying (13.2.12) exactly can be built up as a power series, as was to be proven.


3 Maximally Symmetric Spaces: Construction 

Maximally symmetric spaces are essentially unique, so we can learn all about them by constructing examples with arbitrary curvature $K$ in any way we like. There is one rather obvious way to carry out this construction. (See Figure 13.1.) Consider a flat $(N+1)$-dimensional space, with metric given by

$$-d\tau^2 = g_{AB}\,dx^A\,dx^B \equiv C_{\mu\nu}\,dx^\mu\,dx^\nu + K^{-1}\,dz^2 \tag{13.3.1}$$

where $C_{\mu\nu}$ is a constant $N \times N$ matrix and $K$ is some constant. We can embed a non-Euclidean $N$-dimensional space in this larger space by restricting the variables $x^\mu$ and $z$ to the surface of a sphere (or pseudosphere):

$$K\,C_{\mu\nu}\,x^\mu x^\nu + z^2 = 1 \tag{13.3.2}$$



Figure 13.1 Representation of points on a sphere by projection onto the equatorial plane. Note that two points on the sphere correspond to each projected point with given coordinates $x^i$.


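On this surface, $dz^2$ is given by

$$dz^2 = \frac{K^2\,(C_{\mu\nu}\,x^\mu\,dx^\nu)^2}{z^2} = \frac{K^2\,(C_{\mu\nu}\,x^\mu\,dx^\nu)^2}{1 - K\,C_{\rho\sigma}\,x^\rho x^\sigma}$$

and therefore (13.3.1) gives

$$-d\tau^2 = C_{\mu\nu}\,dx^\mu\,dx^\nu + \frac{K\,(C_{\mu\nu}\,x^\mu\,dx^\nu)^2}{1 - K\,C_{\rho\sigma}\,x^\rho x^\sigma} \tag{13.3.3}$$

The metric is then

$$g_{\mu\nu}(x) = C_{\mu\nu} + \frac{K}{1 - K\,C_{\rho\sigma}\,x^\rho x^\sigma}\,C_{\mu\lambda}\,x^\lambda\,C_{\nu\kappa}\,x^\kappa \tag{13.3.4}$$

A flat space appears here as the special case $K = 0$.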

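This construction makes it obvious that (13.3.4) admits an $N(N+1)/2$-parameter group of isometries, for both the $(N+1)$-dimensional line element (13.3.1) and the “embedding” condition (13.3.2) are manifestly invariant under rigid “rotations” of the $(N+1)$-dimensional space, that is, under the transformations

$$x^\mu \to x'^\mu = R^\mu{}_\nu\,x^\nu + R^\mu{}_z\,z \tag{13.3.5}$$

$$z \to z' = R^z{}_\mu\,x^\mu + R^z{}_z\,z \tag{13.3.6}$$

where the $R^A{}_B$ are constants, with

$$C_{\mu\nu}\,R^\mu{}_\rho\,R^\nu{}_\sigma + K^{-1}\,R^z{}_\rho\,R^z{}_\sigma = C_{\rho\sigma} \tag{13.3.7}$$

$$C_{\mu\nu}\,R^\mu{}_\rho\,R^\nu{}_z + K^{-1}\,R^z{}_\rho\,R^z{}_z = 0 \tag{13.3.8}$$

$$C_{\mu\nu}\,R^\mu{}_z\,R^\nu{}_z + K^{-1}\,(R^z{}_z)^2 = K^{-1} \tag{13.3.9}$$

It is convenient to distinguish two classes of simple transformations satisfying (13.3.7)-(13.3.9):

(A)
$$R^\mu{}_\nu = \mathcal{R}^\mu{}_\nu \qquad R^\mu{}_z = R^z{}_\mu = 0 \qquad R^z{}_z = 1 \tag{13.3.10}$$

where $\mathcal{R}^\mu{}_\nu$ is any $N \times N$ matrix with

$$C_{\mu\nu}\,\mathcal{R}^\mu{}_\rho\,\mathcal{R}^\nu{}_\sigma = C_{\rho\sigma} \tag{13.3.11}$$

These are just rigid “rotations” about the origin:

$$x'^\mu = \mathcal{R}^\mu{}_\nu\,x^\nu \tag{13.3.12}$$

(B)
$$R^\mu{}_z = a^\mu \qquad R^z{}_\mu = -K\,C_{\mu\nu}\,a^\nu \qquad R^z{}_z = (1 - K\,C_{\rho\sigma}\,a^\rho a^\sigma)^{1/2} \tag{13.3.13}$$

$$R^\mu{}_\nu = \delta^\mu_\nu - b\,K\,C_{\nu\rho}\,a^\rho a^\mu \tag{13.3.14}$$

where $a^\mu$ is arbitrary except that $R^z{}_z$ must be real, that is,

$$K\,C_{\rho\sigma}\,a^\rho a^\sigma \le 1 \tag{13.3.15}$$

and

$$b \equiv \frac{1 - (1 - K\,C_{\rho\sigma}\,a^\rho a^\sigma)^{1/2}}{K\,C_{\rho\sigma}\,a^\rho a^\sigma} \tag{13.3.16}$$

These are “quasi translations,” with

$$x'^\mu = x^\mu + a^\mu\left[(1 - K\,C_{\rho\sigma}\,x^\rho x^\sigma)^{1/2} - b\,K\,C_{\rho\sigma}\,x^\rho a^\sigma\right] \tag{13.3.17}$$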

In particular, these transformations take the origin $x^\mu = 0$ into $a^\mu$.

The existence of isometries (13.3.17) that take the origin into any point (at least within a finite region) means that this space is homogeneous; any point is geometrically like any other point. (Our coordinate system hides this property, just as a polar projection map of the earth hides the fact that the curvature of the earth is about the same in Massachusetts as at the North Pole.) Also, the existence of isometries (13.3.10) that include all rigid “rotations” about the origin means that this space is isotropic about the origin. Since the metric is homogeneous, and isotropic about the origin, it is isotropic about every point, and maximally symmetric.
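
The claim that the quasi translations are rigid rotations of the embedding space can be checked numerically. The following is a minimal sketch, assuming $C_{\mu\nu}$ is the unit matrix and $a \neq 0$; it builds the $(N+1)\times(N+1)$ matrix of (13.3.13), (13.3.14) and tests the conditions (13.3.7)-(13.3.9). The function names are hypothetical.

```python
import math


def quasi_translation_matrix(a, K):
    """Matrix R^A_B of Eqs. (13.3.13)-(13.3.14), assuming C = identity and a != 0."""
    n = len(a)
    a2 = sum(c * c for c in a)        # C_{rho sigma} a^rho a^sigma
    root = math.sqrt(1.0 - K * a2)    # R^z_z; real only if K a.a <= 1
    b = (1.0 - root) / (K * a2)       # Eq. (13.3.16)
    R = [[0.0] * (n + 1) for _ in range(n + 1)]
    for m in range(n):
        for v in range(n):
            R[m][v] = (1.0 if m == v else 0.0) - b * K * a[v] * a[m]
        R[m][n] = a[m]                # R^mu_z
        R[n][m] = -K * a[m]           # R^z_mu
    R[n][n] = root
    return R


def preserves_flat_metric(R, K, tol=1e-12):
    """Check G_AB R^A_P R^B_Q = G_PQ with G = diag(1, ..., 1, 1/K)."""
    size = len(R)
    G = [1.0] * (size - 1) + [1.0 / K]
    for p in range(size):
        for q in range(size):
            total = sum(G[A] * R[A][p] * R[A][q] for A in range(size))
            expected = G[p] if p == q else 0.0
            if abs(total - expected) > tol:
                return False
    return True


if __name__ == "__main__":
    R = quasi_translation_matrix([0.3, 0.4], K=0.5)
    print(preserves_flat_metric(R, K=0.5))
    # The origin (x = 0, z = 1) is carried to x' = a: read off the last column.
    print([R[0][2], R[1][2]])
```

The same check goes through for negative $K$, where the last entry of the embedding metric is negative and the "rotation" is really a boost.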

We can construct the Killing vectors for this metric by letting the finite transformations (13.3.5), (13.3.6) approach the unit transformation. First, consider the transformations (A), and let

$$\mathcal{R}^\mu{}_\nu = \delta^\mu_\nu + \epsilon\,\Omega^\mu{}_\nu, \qquad |\epsilon| \ll 1$$

$$C_{\mu\lambda}\,\Omega^\lambda{}_\nu + C_{\nu\lambda}\,\Omega^\lambda{}_\mu = 0 \tag{13.3.18}$$

Comparing with (13.1.3), the corresponding Killing vectors are

$$\xi^\mu_\Omega(x) = \Omega^\mu{}_\nu\,x^\nu \tag{13.3.19}$$

Next, consider the transformations (B), and let

$$a^\mu = \epsilon\,\alpha^\mu, \qquad |\epsilon| \ll 1$$

Comparing with (13.1.3), the corresponding Killing vectors are

$$\xi^\mu_\alpha(x) = \alpha^\mu\left[1 - K\,C_{\rho\sigma}\,x^\rho x^\sigma\right]^{1/2} \tag{13.3.20}$$

The reader may check that (13.3.19) and (13.3.20) do satisfy the Killing conditions (13.1.5). There are $N(N-1)/2$ independent parameters $\Omega^\mu{}_\nu$ [that is, $N^2$ elements $\Omega^\mu{}_\nu$, subject to the $N(N+1)/2$ conditions (13.3.18)] and $N$ parameters $\alpha^\mu$, so this metric admits $N(N+1)/2$ independent Killing vectors, verifying maximal symmetry.
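
The check left to the reader can also be done numerically. The sketch below is one way to do it under stated assumptions: $N = 2$, $C_{\mu\nu}$ the unit matrix, an arbitrarily chosen $K = 0.5$, and derivatives approximated by central differences. It evaluates the Killing condition in its non-covariant form, $\xi^\lambda\,\partial_\lambda g_{\mu\nu} + g_{\lambda\nu}\,\partial_\mu\xi^\lambda + g_{\mu\lambda}\,\partial_\nu\xi^\lambda = 0$; all function names are hypothetical.

```python
import math

K = 0.5   # curvature constant (assumed value for the demonstration)
N = 2     # dimension; C is taken to be the unit matrix


def metric(x):
    """g_{mu nu}(x) of Eq. (13.3.4) with C = identity."""
    denom = 1.0 - K * sum(c * c for c in x)
    return [[(1.0 if m == n else 0.0) + K * x[m] * x[n] / denom
             for n in range(N)] for m in range(N)]


def xi_rotation(omega):
    """Eq. (13.3.19): xi^mu = Omega^mu_nu x^nu, with Omega antisymmetric."""
    return lambda x: [sum(omega[m][n] * x[n] for n in range(N)) for m in range(N)]


def xi_quasi_translation(alpha):
    """Eq. (13.3.20): xi^mu = alpha^mu (1 - K x.x)^(1/2)."""
    return lambda x: [al * math.sqrt(1.0 - K * sum(c * c for c in x)) for al in alpha]


def shifted(x, k, h):
    y = list(x)
    y[k] += h
    return y


def killing_residual(xi, x, h=1e-5):
    """Largest component of the Lie derivative of g along xi (central differences)."""
    g, v = metric(x), xi(x)
    dg = [[[(p - q) / (2 * h) for p, q in zip(rp, rq)]
           for rp, rq in zip(metric(shifted(x, k, h)), metric(shifted(x, k, -h)))]
          for k in range(N)]                      # dg[k][m][n] = d_k g_mn
    dxi = [[(p - q) / (2 * h)
            for p, q in zip(xi(shifted(x, k, h)), xi(shifted(x, k, -h)))]
           for k in range(N)]                     # dxi[k][l] = d_k xi^l
    worst = 0.0
    for m in range(N):
        for n in range(N):
            total = sum(v[l] * dg[l][m][n] + g[l][n] * dxi[m][l] + g[m][l] * dxi[n][l]
                        for l in range(N))
            worst = max(worst, abs(total))
    return worst


if __name__ == "__main__":
    point = [0.3, -0.2]
    print(killing_residual(xi_rotation([[0.0, 1.0], [-1.0, 0.0]]), point))
    print(killing_residual(xi_quasi_translation([1.0, 0.7]), point))
    print(killing_residual(lambda x: [1.0, 0.7], point))  # plain translation: not Killing
```

The last line is a contrast case: an ordinary constant translation is not an isometry of the curved metric, which is why the square-root factor in (13.3.20) is needed.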

The geodesics of this metric take a remarkably simple form. From (13.3.4) we can readily calculate that the affine connection is

$$\Gamma^\mu_{\nu\lambda} = K\,x^\mu\,g_{\nu\lambda} \tag{13.3.21}$$

so the differential equation for a geodesic is

$$\frac{d^2 x^\mu}{d\tau^2} + K\,x^\mu = 0 \tag{13.3.22}$$

The solutions are thus linear combinations of $\sin(\tau\sqrt{K})$ and $\cos(\tau\sqrt{K})$ for $K > 0$, or of $\sinh(\tau\sqrt{-K})$ and $\cosh(\tau\sqrt{-K})$ for $K < 0$.

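We can uncover the inner properties of this space by calculating the curvature tensor; a straightforward computation using Eqs. (6.6.2) and (13.3.21) gives the Riemann-Christoffel tensor for the metric (13.3.4) as

$$R_{\lambda\nu\rho\sigma} = K\left[C_{\lambda\sigma}C_{\nu\rho} - C_{\lambda\rho}C_{\nu\sigma}\right] + K^2\left[1 - K\,C_{\mu\kappa}\,x^\mu x^\kappa\right]^{-1}\left[C_{\lambda\sigma}\,x_\nu x_\rho - C_{\lambda\rho}\,x_\nu x_\sigma + C_{\nu\rho}\,x_\lambda x_\sigma - C_{\nu\sigma}\,x_\rho x_\lambda\right]$$

(where $x_\nu \equiv C_{\nu\mu}\,x^\mu$), or

$$R_{\lambda\nu\rho\sigma} = K\left[g_{\rho\nu}\,g_{\lambda\sigma} - g_{\sigma\nu}\,g_{\lambda\rho}\right]$$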

in agreement with Eq. (13.2.9). Hence the constant K introduced in Eqs. (13.3.1) 
and (13.3.2) is the same as the curvature constant introduced in the last section. 

Since K is an invariant parameter, we cannot by a coordinate transformation 
convert the metric (13.3.4) into a similar metric with a different K. In contrast, 
Eq. (13.3.3) makes it obvious that by a linear transformation

$$x^\mu = A^\mu{}_\nu\,x'^\nu$$

we can convert the metric (13.3.4) into a similar metric with the same $K$ and with $C_{\mu\nu}$ changed into

$$C'_{\mu\nu} = A^\rho{}_\mu\,A^\sigma{}_\nu\,C_{\rho\sigma}$$

Our discussion in Section 3.6 shows that in this way $C_{\mu\nu}$ can be changed into any real symmetric matrix we like, as long as we do not change the numbers of its positive and negative eigenvalues. Also, the numbers of eigenvalues of each sign of the matrix $C_{\mu\nu}$ are the same as for the matrix $g_{\mu\nu}$ at the point $x = 0$, and hence the same everywhere, since all points are equivalent.

An $N$-dimensional metric that allows the introduction of locally Euclidean (as opposed, say, to Minkowskian) coordinate systems will have all its eigenvalues positive, so for $K \neq 0$ we can take $C_{\mu\nu}$ as $|K|^{-1}$ times the unit matrix, in which case (13.3.3) becomes

$$ds^2 = K^{-1}\left[d\mathbf{x}^2 + \frac{(\mathbf{x}\cdot d\mathbf{x})^2}{1 - \mathbf{x}^2}\right] \qquad \text{for } K > 0 \tag{13.3.23}$$

or

$$ds^2 = |K|^{-1}\left[d\mathbf{x}^2 - \frac{(\mathbf{x}\cdot d\mathbf{x})^2}{1 + \mathbf{x}^2}\right] \qquad \text{for } K < 0 \tag{13.3.24}$$

For $K = 0$, we take $C_{\mu\nu}$ as just the unit matrix, and (13.3.3) gives

$$ds^2 = d\mathbf{x}^2 \qquad \text{for } K = 0 \tag{13.3.25}$$

(We are using an obvious $N$-dimensional vector notation. Also, we have replaced $-d\tau^2$ with a proper length $ds^2$, because for the moment we are doing geometry rather than physics.) Let us explore the global properties of these spaces.

For $K > 0$, our most convenient approach is to go back to the interpretation of (13.3.23) as the metric of the curved space embedded by Eq. (13.3.2) in the flat space (13.3.1); that is, (13.3.23) describes the surface

$$\mathbf{x}^2 + z^2 = 1 \tag{13.3.26}$$

in the flat space with

$$ds^2 = K^{-1}\left[d\mathbf{x}^2 + dz^2\right] \tag{13.3.27}$$

Obviously this metric simply describes the surface of a sphere of radius $K^{-1/2}$ in an $(N+1)$-dimensional Euclidean space. (To make the coordinates $\mathbf{x}$ and $z$ truly Euclidean, we should define $\mathbf{x}' = K^{-1/2}\mathbf{x}$ and $z' = K^{-1/2}z$, in which case (13.3.26) reads $\mathbf{x}'^2 + z'^2 = K^{-1}$.) Indeed, in two dimensions we can introduce angular coordinates $\theta$, $\varphi$ by

$$x^1 = \sin\theta\cos\varphi \qquad x^2 = \sin\theta\sin\varphi$$

and (13.3.27) then becomes the familiar line element on a sphere of radius $K^{-1/2}$:

$$ds^2 = K^{-1}\left[d\theta^2 + \sin^2\theta\,d\varphi^2\right] \tag{13.3.28}$$

In general, the range of the variables $\mathbf{x}$ is

$$\mathbf{x}^2 \le 1$$

However, each $\mathbf{x}$ actually corresponds to two points, corresponding to the two roots of Eq. (13.3.26) for $z$. (For instance, in two dimensions the components of $\mathbf{x}$ are the coordinates of points on a sphere projected on a tangent plane; in a polar projection map of the earth, Boston will appear at the same point as San Carlos de Bariloche, Argentina.) The volume of the $N$-dimensional space described by (13.3.23) is therefore

$$V_N = 2\int_{\mathbf{x}^2 \le 1}\sqrt{g}\;dx^1\cdots dx^N = 2K^{-N/2}\int_{\mathbf{x}^2 \le 1}\frac{dx^1\cdots dx^N}{\left[1 - \mathbf{x}^2\right]^{1/2}}$$

A straightforward calculation gives

$$V_N = \frac{2\pi^{(N+1)/2}}{\Gamma\!\left((N+1)/2\right)}\,K^{-N/2} \tag{13.3.29}$$

For instance, $V_1 = 2\pi K^{-1/2}$, which is just the perimeter of a circle of radius $K^{-1/2}$, and $V_2 = 4\pi K^{-1}$, which is just the area of a sphere with radius $K^{-1/2}$. A three-dimensional space of constant positive curvature has the volume

$$V_3 = 2\pi^2 K^{-3/2}$$
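
The three special cases quoted above follow directly from (13.3.29). A minimal numerical sketch, with `sphere_volume` as a hypothetical helper name:

```python
import math


def sphere_volume(n: int, K: float) -> float:
    """V_N of Eq. (13.3.29) for a space of constant positive curvature K."""
    half = (n + 1) / 2
    return 2.0 * math.pi ** half / math.gamma(half) * K ** (-n / 2)


if __name__ == "__main__":
    K = 4.0  # sphere of radius K**-0.5 = 0.5
    for n in (1, 2, 3):
        print(n, sphere_volume(n, K))
```

The cases $N = 1, 2, 3$ use $\Gamma(1) = 1$, $\Gamma(3/2) = \sqrt{\pi}/2$, and $\Gamma(2) = 1$ respectively.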

We can also calculate the circumference of such spaces, using for the geodesics the solutions of Eq. (13.3.22), which now reads

$$\frac{d^2\mathbf{x}}{ds^2} + K\,\mathbf{x} = 0 \tag{13.3.30}$$

The solutions that pass through the point $\mathbf{x} = 0$ are

$$\mathbf{x} = \mathbf{e}\,\sin(sK^{1/2}) \tag{13.3.31}$$

where, in order to satisfy (13.3.23),

$$\mathbf{e}^2 = 1 \tag{13.3.32}$$

As we go out along a geodesic from the “North pole” $\mathbf{x} = 0$, we reach the “equator” $\mathbf{x} = \mathbf{e}$ at $s = \pi K^{-1/2}/2$, we reach the “South pole” $\mathbf{x} = 0$ at $s = \pi K^{-1/2}$, we reach the opposite point $\mathbf{x} = -\mathbf{e}$ of the “equator” at $s = 3\pi K^{-1/2}/2$, and we return to our starting point at $s = 2\pi K^{-1/2}$. Thus the distance from any point around the whole space and back to itself along a geodesic is

$$L = 2\pi K^{-1/2} \tag{13.3.33}$$

for spaces of constant positive curvature and arbitrary dimensionality. This calculation shows very clearly that the space described by (13.3.23) is finite, but it is not bounded; when we come to the apparent singularity at $\mathbf{x}^2 = 1$, we continue right through, but with $z$ given by the root of Eq. (13.3.26) of opposite sign.
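
The normalization (13.3.32) can be checked by inserting the solution (13.3.31) into the line element (13.3.23). The sketch below does this numerically at a few values of $s$ short of the "equator"; the function names and the sample values of $K$ and $\mathbf{e}$ are assumptions made for the illustration.

```python
import math


def speed_squared(e, K, s):
    """(ds/ds)^2 along x(s) = e sin(s sqrt(K)), from the K > 0 metric (13.3.23)."""
    root_k = math.sqrt(K)
    x = [c * math.sin(s * root_k) for c in e]
    dx = [c * root_k * math.cos(s * root_k) for c in e]   # dx/ds
    x2 = sum(c * c for c in x)
    dx2 = sum(c * c for c in dx)
    x_dot_dx = sum(p * q for p, q in zip(x, dx))
    return (dx2 + x_dot_dx ** 2 / (1.0 - x2)) / K


def circumference(K):
    """Closed-geodesic length of Eq. (13.3.33)."""
    return 2.0 * math.pi / math.sqrt(K)


if __name__ == "__main__":
    e = (0.6, 0.8, 0.0)          # unit vector, so e^2 = 1
    for s in (0.1, 0.4, 0.7):
        print(s, speed_squared(e, 2.5, s))
    print(circumference(2.5))
```

With $\mathbf{e}^2 = 1$ the result is 1 at every sampled point, confirming that $s$ is the proper length along the curve; any other normalization of $\mathbf{e}$ gives a position-dependent value.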

For K < 0 the metric (13.3.24) does not even have an apparent singularity, 
and there is nothing to restrict the coordinates x to any finite range. This can be 
seen even more definitely by calculating the geodesics, which are now given by Eqs. (13.3.30) and (13.3.24) as

$$\mathbf{x} = \mathbf{e}\,\sinh\!\left(s(-K)^{1/2}\right) \tag{13.3.34}$$

$$\mathbf{e}^2 = 1 \tag{13.3.35}$$

We can obviously go out along this geodesic an unlimited distance from the origin. For $N = 2$, this space is just that discovered by Gauss, Bolyai, and Lobachevski. [See Section 1.1. In order to put the metric in the form (1.1.9) of Klein’s model, it is necessary to introduce a new set of coordinates $x'^i$, defined by $\mathbf{x}' = \mathbf{x}(1 + \mathbf{x}^2)^{-1/2}$.] We see from (13.3.1) and (13.3.2) that this geometry describes the surface

$$-\mathbf{x}^2 + z^2 = 1 \tag{13.3.36}$$

in a flat space, with

$$ds^2 = |K|^{-1}\left[d\mathbf{x}^2 - dz^2\right] \tag{13.3.37}$$

The minus sign in (13.3.37) means that this flat space is not Euclidean. It is there- 
fore understandable that the Gauss-Bolyai-Lobachevski geometry could not be 
discovered until geometers had learned to think of curved surfaces, not as sub- 
spaces of an ordinary Euclidean space, but as spaces characterized by their own 
inner metric relations.

Finally, let us return to space-time, and consider the structure of a four-dimensional maximally symmetric metric with three positive and one negative eigenvalue. In this case, we can set

$$C_{\mu\nu} = \eta_{\mu\nu} \tag{13.3.38}$$

and the metric is

$$-d\tau^2 = d\mathbf{x}^2 - dt^2 + \frac{K\,(\mathbf{x}\cdot d\mathbf{x} - t\,dt)^2}{1 - K(\mathbf{x}^2 - t^2)} \tag{13.3.39}$$

For $K > 0$, we can introduce coordinates in which the metric appears spatially flat, by setting

$$t = K^{-1/2}\left[\frac{K\mathbf{x}'^2}{2}\cosh(K^{1/2}t') + \left(1 + \frac{K\mathbf{x}'^2}{2}\right)\sinh(K^{1/2}t')\right]$$

$$\mathbf{x} = \mathbf{x}'\exp(K^{1/2}t') \tag{13.3.40}$$

Then (13.3.39) becomes

$$d\tau^2 = dt'^2 - \exp(2K^{1/2}t')\,d\mathbf{x}'^2 \tag{13.3.41}$$

We can also introduce coordinates in which the metric appears time-independent, by setting

$$t'' = t' - \frac{1}{2K^{1/2}}\ln\left[1 - K\mathbf{x}'^2\exp(2K^{1/2}t')\right]$$

$$\mathbf{x}'' = \mathbf{x}'\exp(K^{1/2}t') \tag{13.3.42}$$

Then (13.3.41) becomes

$$d\tau^2 = (1 - K\mathbf{x}''^2)\,dt''^2 - d\mathbf{x}''^2 - \frac{K\,(\mathbf{x}''\cdot d\mathbf{x}'')^2}{1 - K\mathbf{x}''^2} \tag{13.3.43}$$

This metric was first discussed in this form by de Sitter;² it will provide the basis for our treatment of the steady state cosmology in Chapter 14.

Once again, it should be stressed that the maximally symmetric metric 
(13.3.4), although derived by an apparently arbitrary procedure, actually repre- 
sents the most general possible maximally symmetric metric, because the unique- 
ness theorem of the last section tells us that any other maximally symmetric 
metric can be converted into the form (13.3.4) by a suitable coordinate trans- 
formation. 


4 Tensors in a Maximally Symmetric Space 

The assumption of maximal symmetry can be applied, not only to the metric 
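of a space, but to any tensor fields that inhabit the space. A tensor field $T_{\mu\nu\cdots}(x)$ is said to be form-invariant under a transformation $x \to x'$ if $T'_{\mu\nu\cdots}(x')$ is the same function of its argument $x'^\mu$ as $T_{\mu\nu\cdots}(x)$ was of its argument $x^\mu$, that is,

$$T'_{\mu\nu\cdots}(y) = T_{\mu\nu\cdots}(y) \qquad \text{for all } y \tag{13.4.1}$$

At any given point, the transformed tensor is given by the usual formula

$$T_{\mu\nu\cdots}(x) = \frac{\partial x'^\rho}{\partial x^\mu}\frac{\partial x'^\sigma}{\partial x^\nu}\cdots T'_{\rho\sigma\cdots}(x')$$

so the form-invariance condition (13.4.1) reads

$$T_{\mu\nu\cdots}(x) = \frac{\partial x'^\rho}{\partial x^\mu}\frac{\partial x'^\sigma}{\partial x^\nu}\cdots T_{\rho\sigma\cdots}(x') \tag{13.4.2}$$

For an infinitesimal transformation

$$x'^\mu = x^\mu + \epsilon\,\xi^\mu(x) \qquad |\epsilon| \ll 1$$

the condition (13.4.2) becomes, to first order in $\epsilon$,

$$0 = \frac{\partial \xi^\rho(x)}{\partial x^\mu}\,T_{\rho\nu\cdots}(x) + \frac{\partial \xi^\rho(x)}{\partial x^\nu}\,T_{\mu\rho\cdots}(x) + \cdots + \xi^\lambda(x)\,\frac{\partial}{\partial x^\lambda}T_{\mu\nu\cdots}(x) \tag{13.4.3}$$

(That is, the Lie derivative of $T_{\mu\nu\cdots}$ with respect to $\xi^\lambda$ vanishes; see Section 10.9.) A tensor in a maximally symmetric space, which satisfies (13.4.3) for all $N(N+1)/2$ independent Killing vectors $\xi^\lambda(x)$, will be called maximally form-invariant.

For a scalar $S(x)$, Eq. (13.4.3) reads simply

$$\xi^\lambda(x)\,\frac{\partial S(x)}{\partial x^\lambda} = 0 \tag{13.4.4}$$

If the scalar is maximally form-invariant, then $\xi^\lambda(x)$ can at any given point be chosen to have any value we like, and (13.4.4) therefore requires that $S$ be constant:

$$\frac{\partial S}{\partial x^\lambda} = 0 \tag{13.4.5}$$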

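For any other maximally form-invariant tensor, it is convenient first to choose a Killing vector $\xi^\lambda(x)$ that at a given point $X$ satisfies

$$\xi^\lambda(X) = 0$$

and for which the quantities

$$\xi_{\sigma;\tau}(X) = g_{\sigma\lambda}(X)\left[\frac{\partial \xi^\lambda(x)}{\partial x^\tau}\right]_{x=X}$$

form an arbitrary antisymmetric matrix. Equation (13.4.3) then reads, at $x = X$:

$$0 = \xi_{\sigma;\tau}\left\{\delta^\tau_\mu\,T^\sigma{}_{\nu\cdots} + \delta^\tau_\nu\,T_\mu{}^\sigma{}_{\cdots} + \cdots\right\}$$

Since $\xi_{\sigma;\tau}$ is an arbitrary antisymmetric matrix, its coefficient must be symmetric in $\sigma$ and $\tau$:

$$\delta^\tau_\mu\,T^\sigma{}_{\nu\cdots} + \delta^\tau_\nu\,T_\mu{}^\sigma{}_{\cdots} + \cdots = \delta^\sigma_\mu\,T^\tau{}_{\nu\cdots} + \delta^\sigma_\nu\,T_\mu{}^\tau{}_{\cdots} + \cdots \tag{13.4.6}$$

Since $X$ was arbitrary, this must hold everywhere.

For a maximally form-invariant vector $A_\mu(x)$, Eq. (13.4.6) reads

$$\delta^\tau_\mu\,A^\sigma = \delta^\sigma_\mu\,A^\tau$$

Contracting $\tau$ with $\mu$, we find that in $N$ dimensions

$$N A^\sigma = A^\sigma$$

so, except for the trivial case $N = 1$, we must have

$$A^\sigma = 0 \tag{13.4.7}$$

For a maximally form-invariant tensor $B_{\mu\nu}$ of second rank, Eq. (13.4.6) reads

$$\delta^\tau_\mu\,B^\sigma{}_\nu + \delta^\tau_\nu\,B_\mu{}^\sigma = \delta^\sigma_\mu\,B^\tau{}_\nu + \delta^\sigma_\nu\,B_\mu{}^\tau$$

Contracting $\tau$ with $\mu$ gives

$$N B^\sigma{}_\nu + B_\nu{}^\sigma = B^\sigma{}_\nu + \delta^\sigma_\nu\,B^\mu{}_\mu$$

or, lowering the $\sigma$ index,

$$(N-1)\,B_{\sigma\nu} + B_{\nu\sigma} = g_{\sigma\nu}\,B^\mu{}_\mu \tag{13.4.8}$$

Subtracting the same equation with $\nu$ and $\sigma$ interchanged yields

$$(N-2)\left(B_{\sigma\nu} - B_{\nu\sigma}\right) = 0$$

so as long as $N \neq 2$, the tensor $B_{\sigma\nu}$ must be symmetric:

$$B_{\sigma\nu} = B_{\nu\sigma} \tag{13.4.9}$$

(In two dimensions, $B_{\sigma\nu}$ can have an antisymmetric part proportional to $g^{-1/2}\epsilon_{\sigma\nu}$; see Section 4.4.) Using (13.4.9) in (13.4.8) gives now for $N \ge 3$ (and for the symmetric part of $B_{\sigma\nu}$ for $N = 2$)

$$B_{\sigma\nu} = f\,g_{\sigma\nu}$$

where

$$f \equiv \frac{1}{N}\,B^\mu{}_\mu \tag{13.4.10}$$

To determine the dependence of $f$ on the coordinates, we can use (13.4.10) back in the form-invariance condition (13.4.3):

$$0 = \frac{\partial \xi^\rho}{\partial x^\mu}\,f\,g_{\rho\nu} + \frac{\partial \xi^\rho}{\partial x^\nu}\,f\,g_{\mu\rho} + \xi^\lambda\,\frac{\partial}{\partial x^\lambda}\left(f\,g_{\mu\nu}\right)$$

But $g_{\mu\nu}$ satisfies the Killing condition (13.1.4), so this becomes

$$0 = g_{\mu\nu}\,\xi^\lambda\,\frac{\partial f}{\partial x^\lambda}$$

In a maximally symmetric space we can at any given point choose $\xi^\lambda$ to have any value we like, and therefore

$$\frac{\partial f}{\partial x^\lambda} = 0 \tag{13.4.11}$$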

Thus the only maximally form-invariant tensor of second rank is the metric tensor , 
times a possible constant. 
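
The algebraic content of this result can be seen in a small brute-force check of the rank-two form of (13.4.6). The sketch below assumes a Euclidean metric $g_{\mu\nu} = \delta_{\mu\nu}$, so that index positions can be ignored; `satisfies_symmetry_condition` is a hypothetical helper name.

```python
from itertools import product


def satisfies_symmetry_condition(B):
    """Test the rank-two case of Eq. (13.4.6) for a Euclidean metric:
    d(t,m) B[s][v] + d(t,v) B[m][s] == d(s,m) B[t][v] + d(s,v) B[m][t]
    for all index values, where d is the Kronecker delta."""
    n = len(B)

    def d(i, j):
        return 1 if i == j else 0

    for s, t, m, v in product(range(n), repeat=4):
        lhs = d(t, m) * B[s][v] + d(t, v) * B[m][s]
        rhs = d(s, m) * B[t][v] + d(s, v) * B[m][t]
        if lhs != rhs:
            return False
    return True


if __name__ == "__main__":
    metric_multiple = [[5, 0, 0], [0, 5, 0], [0, 0, 5]]
    generic_diagonal = [[1, 0, 0], [0, 2, 0], [0, 0, 3]]
    antisymmetric_3d = [[0, 1, 0], [-1, 0, 0], [0, 0, 0]]
    antisymmetric_2d = [[0, 1], [-1, 0]]
    for B in (metric_multiple, generic_diagonal, antisymmetric_3d, antisymmetric_2d):
        print(satisfies_symmetry_condition(B))
```

A multiple of the metric passes, a generic symmetric or antisymmetric tensor in three dimensions fails, and the two-dimensional antisymmetric tensor passes, in line with the exception noted for $N = 2$.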


5 Spaces with Maximally Symmetric Subspaces 

In many cases of physical importance, the whole space (or space-time) is not 
maximally symmetric, but it can be decomposed into maximally symmetric 
subspaces. For instance, a spherically symmetric three-dimensional space can be 
decomposed into a family of spherical surfaces centered on the origin, each of 
which is described by a metric of the form (13.3.28). Also, in Chapter 14 we shall 
deal with space-times in which the metric is spherically symmetric and homo- 
geneous on each “plane” of constant time. 

We shall see here that the maximal symmetry of a family of subspaces imposes 
very strong constraints on the metric of the whole space. In order to state and 
prove these results, let us first adopt a suitable coordinate system. If the whole space has $N$ dimensions and its maximally symmetric subspaces have $M$ dimensions, then we can distinguish these subspaces from each other with $N - M$ coordinate labels $v^a$, and locate points within each subspace with $M$ coordinates $u^i$. Some illustrations are given in Table 13.1.


Table 13.1 Examples of Spaces with Maximally Symmetric Subspaces

Example                                            v-Coordinates    u-Coordinates
Spherically symmetric space                        r                θ, φ
Spherically symmetric space-time                   r, t             θ, φ
Spherically symmetric and homogeneous space-time   t                r, θ, φ

We say that the subspaces with constant $v^a$ are maximally symmetric if the metric of the whole space is invariant under a group of infinitesimal transformations

$$u^i \to u'^i = u^i + \epsilon\,\xi^i(u, v) \tag{13.5.1}$$

$$v^a \to v'^a = v^a \tag{13.5.2}$$

with $M(M+1)/2$ independent Killing vectors $\xi^i$. These are transformations of the general form (13.1.3), but with the special feature that the $v^a$ are invariant, so that

$$\xi^a(u, v) = 0 \tag{13.5.3}$$

Note that, although these transformations affect only the $u$-variables, there is no reason why the transformation rules cannot depend parametrically on the labels $v^a$ of the particular subspace being transformed. Also, our statement that there are $M(M+1)/2$ “independent” Killing vectors should be construed to mean that there are $M(M+1)/2$ Killing vectors that are not subject to any linear relations with $u$-independent coefficients.

The general result that governs the structure of such spaces is contained in the following theorem: It is always possible to choose the $u$-coordinates so that the metric of the whole space is given by

$$-d\tau^2 \equiv g_{\mu\nu}\,dx^\mu\,dx^\nu = g_{ab}(v)\,dv^a\,dv^b + f(v)\,\tilde{g}_{ij}(u)\,du^i\,du^j \tag{13.5.4}$$

where $g_{ab}(v)$ and $f(v)$ are functions of the $v$-coordinates alone, and $\tilde{g}_{ij}(u)$ is a function of the $u$-coordinates alone that is by itself the metric of an $M$-dimensional maximally symmetric space. (The summation convention is in force, with $a, b, \ldots$ running over the $N - M$ labels of the $v$-coordinates, and $i, j, k, l, \ldots$ running over the $M$ labels of the $u$-coordinates.)

To begin our proof, we set down the condition that (13.5.1) is an isometry of the whole metric $g_{\mu\nu}(x)$. It is convenient to use this condition here in its original form (13.1.4) rather than in the more elegant covariant form (13.1.5). Each index $\mu, \nu, \rho, \ldots$ in (13.1.4) now runs over the $N - M$ coordinate labels of the $v^a$ and the $M$ coordinate labels of the $u^i$, so (13.1.4) now yields three separate equations: For $\rho = i$, $\sigma = j$ we have

$$0 = \frac{\partial \xi^k(u, v)}{\partial u^i}\,g_{kj}(u, v) + \frac{\partial \xi^k(u, v)}{\partial u^j}\,g_{ki}(u, v) + \xi^k(u, v)\,\frac{\partial g_{ij}(u, v)}{\partial u^k} \tag{13.5.5}$$

for $\rho = i$, $\sigma = a$ we have

$$0 = \frac{\partial \xi^k(u, v)}{\partial u^i}\,g_{ka}(u, v) + \frac{\partial \xi^k(u, v)}{\partial v^a}\,g_{ik}(u, v) + \xi^k(u, v)\,\frac{\partial g_{ia}(u, v)}{\partial u^k} \tag{13.5.6}$$

and for $\rho = a$, $\sigma = b$ we have

$$0 = \frac{\partial \xi^k(u, v)}{\partial v^a}\,g_{kb}(u, v) + \frac{\partial \xi^k(u, v)}{\partial v^b}\,g_{ka}(u, v) + \xi^k(u, v)\,\frac{\partial g_{ab}(u, v)}{\partial u^k} \tag{13.5.7}$$

Of these three equations, the first simply tells us that $g_{ij}(u, v)$ must, for each fixed set of $v^a$, be the metric of an $M$-dimensional space, with coordinates $u^i$, which admits the Killing vector $\xi^i(u, v)$. We assume here that there are $M(M+1)/2$ such independent Killing vectors, so this means that the submatrix $g_{ij}(u, v)$ is by itself a maximally symmetric metric for each set of fixed $v^a$. According to the arguments of Section 13.1, it also follows that at any given point $u_0$ we can find Killing vectors $\xi^k(u, v)$ for which $\xi^k(u_0, v)$ and $\xi_{k;l}(u_0, v)$ take arbitrary values, subject only to the requirement that $\xi_{k;l} = -\xi_{l;k}$. Thus the metric $g_{ij}(u, v)$ is for each $v$ both homogeneous in $u$ and isotropic about any point.

The other two equations contain information about the other elements $g_{ai}$ and $g_{ab}$, and also about the $v$-dependence of the Killing vectors. (This $v$-dependence is not entirely arbitrary. For instance, it is true that by redefining the $u$-coordinates we can always arrange that the metric $g_{ij}(u, v)$ has $v$-independent Killing vectors $\tilde{\xi}^i(u)$, but the Killing vectors $\xi^i(u, v)$ of the whole space will then in general be linear combinations of the $\tilde{\xi}^i(u)$, with coefficients that can depend on the $v$-coordinates.) In order to disentangle the different information contained in (13.5.6) and (13.5.7), it is extremely useful to choose a new set of coordinates $u'^i(u, v)$ of the maximally symmetric subspaces, so that $g'_{ia}$ vanishes. Suppose for a moment that we can find a function $U^k(v; u_0)$ that satisfies the differential equation

$$g_{ik}(U, v)\,\frac{\partial U^k}{\partial v^a} = -g_{ia}(U, v) \tag{13.5.8}$$

with the initial condition that

$$U^k(v_0; u_0) = u_0^k \tag{13.5.9}$$

at some point $v_0^a$. The coordinates $u'^i$, $v'^a$ are then defined by

$$u^i = U^i(v'; u') \tag{13.5.10}$$

$$v^a = v'^a \tag{13.5.11}$$

In this coordinate system, the metric has

$$g'_{ia} = \frac{\partial u^l}{\partial u'^i}\frac{\partial u^k}{\partial v'^a}\,g_{lk}(u, v) + \frac{\partial u^l}{\partial u'^i}\,g_{la}(u, v) = \frac{\partial U^l(v'; u')}{\partial u'^i}\left(\frac{\partial U^k(v'; u')}{\partial v'^a}\,g_{lk}(U, v') + g_{la}(U, v')\right)$$

and thus (13.5.8) gives

$$g'_{ia} = 0 \tag{13.5.12}$$

Thus we can construct $u'$-coordinates in which $g'_{ia}$ vanishes, if we can find solutions of the differential equation (13.5.8) with arbitrary initial conditions (13.5.9).

We can rewrite (13.5.8) in the equivalent form

$$\frac{\partial U^k}{\partial v^a} = -F^k{}_a(U, v) \tag{13.5.13}$$

where

$$F^k{}_a(U, v) \equiv \bar{g}^{kl}(U, v)\,g_{la}(U, v) \tag{13.5.14}$$

and $\bar{g}^{ij}$ is the matrix reciprocal to $g_{ij}$, that is,

$$\bar{g}^{ij}\,g_{jk} = \delta^i_k \tag{13.5.15}$$

(The bar is to remind us that the $ij$-element $\bar{g}^{ij}$ of the matrix reciprocal to $g_{ij}$ is not the same as the $ij$-element $g^{ij}$ of the matrix $g^{\mu\nu}$ reciprocal to $g_{\mu\nu}$.) When there is only one $v$-coordinate, as will be the case in our chapters on cosmology, it is obvious that (13.5.13) can be solved with arbitrary initial conditions. In the general case, we shall have to do some work to prove that (13.5.13) is integrable. Our method is the same as in Section 13.2; we try to solve (13.5.13) in a neighborhood of $v_0$ with a power series in $v - v_0$:

$$U^k = \sum_{n=0}^{\infty}\frac{1}{n!}\,c^k{}_{a_1\cdots a_n}\,(v - v_0)^{a_1}\cdots(v - v_0)^{a_n} \tag{13.5.16}$$

Clearly the initial conditions (13.5.9) are satisfied if we choose the $n = 0$ coefficient as

$$c^k = u_0^k$$

and Eq. (13.5.13) is satisfied to zero order in $v - v_0$ if we choose

$$c^k{}_a = -F^k{}_a(u_0, v_0)$$

Now, proceeding by mathematical induction, suppose that we are able to choose the terms in (13.5.16) up to order $(v - v_0)^n$ so that (13.5.13) is satisfied to order $(v - v_0)^{n-1}$. Then we can use these terms to calculate the term in $F^k{}_a(U, v)$ of order $(v - v_0)^n$. Let us write this term as

$$\left[F^k{}_a(U(v; u_0), v)\right]_{\text{order } n} = \frac{1}{n!}\,f^k{}_{a b_1\cdots b_n}\,(v - v_0)^{b_1}\cdots(v - v_0)^{b_n}$$

Then (13.5.16) will satisfy (13.5.13) to order $(v - v_0)^n$ if we choose the term in $U$ of order $n+1$ as

$$\left[U^k(v; u_0)\right]_{\text{order } n+1} = -\frac{1}{(n+1)!}\,f^k{}_{a b_1\cdots b_n}\,(v - v_0)^a\,(v - v_0)^{b_1}\cdots(v - v_0)^{b_n}$$

providing that $f$ is symmetric in all its subscripts. Since $f^k{}_{a b_1\cdots b_n}$ can obviously be chosen symmetric in the $b$'s, it is sufficient to require that it should also be symmetric between $a$ and any $b$, or equivalently that

$$\left[\frac{\partial}{\partial v^b}\,F^k{}_a(U(v; u_0), v)\right]_{\text{order } n-1}$$

should be symmetric in $a$ and $b$. But $U$ is assumed to satisfy (13.5.13) to order $(v - v_0)^{n-1}$, so this condition is satisfied if

$$\left[\frac{\partial F^k{}_a(u, v)}{\partial u^l}\,F^l{}_b(u, v) - \frac{\partial F^k{}_a(u, v)}{\partial v^b}\right]_{u = U(v; u_0)}$$

is symmetric in $a$ and $b$. We thus conclude that (13.5.13) is integrable if

$$\frac{\partial F^k{}_a(u, v)}{\partial u^l}\,F^l{}_b(u, v) - \frac{\partial F^k{}_a(u, v)}{\partial v^b} = \frac{\partial F^k{}_b(u, v)}{\partial u^l}\,F^l{}_a(u, v) - \frac{\partial F^k{}_b(u, v)}{\partial v^a} \tag{13.5.17}$$

for all $u$, $v$.

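In order to prove that (13.5.17) is indeed satisfied, we return to the Killing
vector condition (13.5.6). Multiplying with $\bar{g}^{li}$, we have

$$\frac{\partial \xi^l}{\partial v^a} = -\bar{g}^{li} g_{ma} \frac{\partial \xi^m}{\partial u^i} - \xi^k \bar{g}^{li} \frac{\partial g_{ia}}{\partial u^k}$$

Also, multiplying (13.5.5) with $\bar{g}^{li} \bar{g}^{jm}$ gives

$$\bar{g}^{li} \frac{\partial \xi^m}{\partial u^i} = -\bar{g}^{mj} \frac{\partial \xi^l}{\partial u^j} - \xi^k \bar{g}^{li} \bar{g}^{jm} \frac{\partial g_{ij}}{\partial u^k} = -\bar{g}^{mj} \frac{\partial \xi^l}{\partial u^j} + \xi^k \frac{\partial \bar{g}^{lm}}{\partial u^k}$$

so

$$\frac{\partial \xi^l}{\partial v^a} = \bar{g}^{mj} g_{ma} \frac{\partial \xi^l}{\partial u^j} - \xi^k g_{ma} \frac{\partial \bar{g}^{lm}}{\partial u^k} - \xi^k \bar{g}^{lm} \frac{\partial g_{ma}}{\partial u^k}$$

Recalling (13.5.14), we can write this as

$$\frac{\partial \xi^l}{\partial v^a} = F^j{}_a \frac{\partial \xi^l}{\partial u^j} - \xi^k \frac{\partial F^l{}_a}{\partial u^k} \tag{13.5.18}$$

Now differentiate with respect to $v^b$; this gives

$$\frac{\partial^2 \xi^l}{\partial v^b\, \partial v^a} = F^j{}_a \frac{\partial}{\partial u^j}\left(\frac{\partial \xi^l}{\partial v^b}\right) + \frac{\partial F^j{}_a}{\partial v^b} \frac{\partial \xi^l}{\partial u^j} - \frac{\partial \xi^k}{\partial v^b} \frac{\partial F^l{}_a}{\partial u^k} - \xi^k \frac{\partial^2 F^l{}_a}{\partial v^b\, \partial u^k}$$

or, using (13.5.18) on the right-hand side,

$$\begin{aligned}
\frac{\partial^2 \xi^l}{\partial v^b\, \partial v^a} ={}& F^j{}_a F^i{}_b \frac{\partial^2 \xi^l}{\partial u^j\, \partial u^i} + F^j{}_a \frac{\partial F^i{}_b}{\partial u^j} \frac{\partial \xi^l}{\partial u^i} - F^j{}_a \frac{\partial \xi^k}{\partial u^j} \frac{\partial F^l{}_b}{\partial u^k} - F^j{}_a\, \xi^k \frac{\partial^2 F^l{}_b}{\partial u^k\, \partial u^j} \\
& + \frac{\partial F^j{}_a}{\partial v^b} \frac{\partial \xi^l}{\partial u^j} - F^j{}_b \frac{\partial \xi^k}{\partial u^j} \frac{\partial F^l{}_a}{\partial u^k} + \xi^i \frac{\partial F^k{}_b}{\partial u^i} \frac{\partial F^l{}_a}{\partial u^k} - \xi^k \frac{\partial^2 F^l{}_a}{\partial v^b\, \partial u^k}
\end{aligned}$$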
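But this must be symmetric between $a$ and $b$, so

$$\begin{aligned}
0 ={}& \left[F^j{}_a \frac{\partial F^i{}_b}{\partial u^j} - F^j{}_b \frac{\partial F^i{}_a}{\partial u^j} + \frac{\partial F^i{}_a}{\partial v^b} - \frac{\partial F^i{}_b}{\partial v^a}\right] \frac{\partial \xi^l}{\partial u^i} \\
& + \xi^k \left[-F^j{}_a \frac{\partial^2 F^l{}_b}{\partial u^k\, \partial u^j} + F^j{}_b \frac{\partial^2 F^l{}_a}{\partial u^k\, \partial u^j} + \frac{\partial F^i{}_b}{\partial u^k} \frac{\partial F^l{}_a}{\partial u^i} - \frac{\partial F^i{}_a}{\partial u^k} \frac{\partial F^l{}_b}{\partial u^i} - \frac{\partial^2 F^l{}_a}{\partial v^b\, \partial u^k} + \frac{\partial^2 F^l{}_b}{\partial v^a\, \partial u^k}\right]
\end{aligned} \tag{13.5.19}$$

We have already remarked that our assumption that there are $M(M + 1)/2$
independent Killing vectors allows us at any given point to find Killing vectors
for which $\xi^k$ vanishes and for which $\xi_{k;i} = g_{kl}\, \partial \xi^l / \partial u^i$ is an arbitrary antisymmetric
matrix. In particular, we can at any given point choose $\xi$ so that

$$\xi^k = 0 \qquad \xi_{k;i} = g_{kl} \frac{\partial \xi^l}{\partial u^i} = \delta_{km} \delta_{in} - \delta_{kn} \delta_{im}$$

Hence multiplying (13.5.19) with $g_{kl}$ and setting $k = n \neq m$, we find that

$$F^j{}_a \frac{\partial F^m{}_b}{\partial u^j} - F^j{}_b \frac{\partial F^m{}_a}{\partial u^j} = \frac{\partial F^m{}_b}{\partial v^a} - \frac{\partial F^m{}_a}{\partial v^b}$$

which is just the desired relation (13.5.17). The coefficient of $\xi^k$ in (13.5.19) must
also vanish, but we do not need this information here.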

To return now to the main line of our proof: Having proved (13.5.17), we
know that (13.5.13) is integrable, so we can construct the coordinates $u'^i$, $v'^a$
defined by (13.5.10) and (13.5.11), in which the metric components $g'_{ia}$ vanish.
Let us do so, and drop the primes, so that now

$$g_{ia} = 0 \tag{13.5.20}$$

The Killing vector conditions (13.5.6), (13.5.7) now read

$$0 = g_{ik} \frac{\partial \xi^k}{\partial v^a} \tag{13.5.21}$$

$$0 = \xi^k \frac{\partial g_{ab}}{\partial u^k} \tag{13.5.22}$$

Since $g_{ik}$ is nonsingular, it follows from (13.5.21) that

$$\frac{\partial \xi^k}{\partial v^a} = 0 \tag{13.5.23}$$
Also, we have noted that at each point we can find Killing vectors for which $\xi^k$
takes any arbitrary value, so the coefficient of $\xi^k$ in (13.5.22) must vanish:

$$\frac{\partial g_{ab}}{\partial u^k} = 0 \tag{13.5.24}$$

It only remains to show that $g_{ij}(u, v)$ is $u$-independent, except for a possible
factor $f(v)$. We use the fact that for any fixed $v_0$ there are $M(M + 1)/2$ independent
Killing vectors of $g_{ij}(u, v_0)$ that, according to (13.5.23), are also Killing vectors
of $g_{ij}(u, v)$ for any $v$. Each one of these Killing vectors $\xi^i(u)$ will then satisfy
(13.5.5) at $v = v_0$ and for general $v$:

$$0 = \frac{\partial \xi^k(u)}{\partial u^i}\, g_{kj}(u, v_0) + \frac{\partial \xi^k(u)}{\partial u^j}\, g_{ki}(u, v_0) + \xi^k(u) \frac{\partial g_{ij}(u, v_0)}{\partial u^k}$$

$$0 = \frac{\partial \xi^k(u)}{\partial u^i}\, g_{kj}(u, v) + \frac{\partial \xi^k(u)}{\partial u^j}\, g_{ki}(u, v) + \xi^k(u) \frac{\partial g_{ij}(u, v)}{\partial u^k}$$

We can interpret these two equations as saying that $g_{ij}(u, v)$ is a maximally
form-invariant tensor [in the sense of Eq. (13.4.3)] in the maximally symmetric space
with metric $g_{ij}(u, v_0)$. It follows then, according to Eqs. (13.4.10) and (13.4.11),
that the tensor $g_{ij}(u, v)$ is proportional to the metric $g_{ij}(u, v_0)$, with a $u$-independent
coefficient:

$$g_{ij}(u, v) = f(v, v_0)\, g_{ij}(u, v_0)$$

The value of $v_0$ can be fixed in any way we like, so we can suppress the label $v_0$,
and write this as

$$g_{ij}(u, v) = f(v)\, \tilde{g}_{ij}(u) \tag{13.5.25}$$

with

$$f(v) \equiv f(v, v_0) \qquad \tilde{g}_{ij}(u) \equiv g_{ij}(u, v_0) \tag{13.5.26}$$

Putting together (13.5.20), (13.5.24), and (13.5.25) now shows that the metric
$g_{\mu\nu}(u, v)$ does have the form given by (13.5.4), and (13.5.26) and (13.5.5) with
$v = v_0$ show that $\tilde{g}_{ij}(u)$ is a maximally symmetric metric, as was to be proved.

This theorem could also have been proved under the apparently weaker
assumption, that the whole space can be decomposed into subspaces that are
isotropic about every point. This assumption means that at any point $u_0$, $v$ we can
find Killing vectors of the whole space with $\xi^a = 0$, for which $\xi^i$ vanishes at
$u_0$, $v$, and for which $\xi_{i;k}$ at $u_0$, $v$ is an arbitrary antisymmetric matrix. In
particular, we can find $M(M - 1)/2$ Killing vectors $\xi^{(lm)}(u, v; u_0)$ with

$$\xi^{a(lm)}(u, v; u_0) = 0$$

$$\xi^{i(lm)}(u, v; u_0) = -\xi^{i(ml)}(u, v; u_0)$$

for which

$$\xi^{(lm)}_{i;k}(u_0, v; u_0) = g_{ij}(u_0, v) \left[\frac{\partial \xi^{j(lm)}(u, v; u_0)}{\partial u^k}\right]_{u = u_0} = \delta^l_i \delta^m_k - \delta^l_k \delta^m_i$$

We can then define

$$\xi^{i(l)}(u, v; u_0) \equiv \frac{\partial}{\partial u_0^m}\, \xi^{i(lm)}(u, v; u_0)$$

and the arguments of Section 13.1 show that these are Killing vectors of the whole
space, with

$$\xi^{a(l)}(u, v; u_0) = 0$$

and with

$$\xi^{i(l)}(u_0, v; u_0) = -(M - 1)\, \bar{g}^{il}(u_0, v)$$

The existence of the $M(M + 1)/2$ independent Killing vectors $\xi^{(lm)}$ and $\xi^{(l)}$
shows that the space does have maximally symmetric subspaces after all.

In all cases of practical importance, the maximally symmetric subspaces are
spaces, as opposed to space-times, so all eigenvalues of the submatrix $\tilde{g}_{ij}$ are
positive. In this case, we can use (13.3.23), (13.3.24), or (13.3.25) to evaluate
$\tilde{g}_{ij}\, du^i\, du^j$, and (13.5.3) then gives

$$-d\tau^2 = g_{ab}(v)\, dv^a\, dv^b + f(v) \left\{ d\mathbf{u}^2 + \frac{k\, (\mathbf{u} \cdot d\mathbf{u})^2}{1 - k \mathbf{u}^2} \right\} \tag{13.5.27}$$

where $f(v)$ is positive and

$$k = \begin{cases} +1 & \text{if max. sym. subsp. has } K > 0 \\ -1 & \text{if max. sym. subsp. has } K < 0 \\ \phantom{+}0 & \text{if max. sym. subsp. has } K = 0 \end{cases} \tag{13.5.28}$$

[We have absorbed the curvature constant $|K|^{-1}$ appearing in (13.3.23) and
(13.3.24) into the function $f(v)$.] Let us now use this formula to treat the special
cases listed in Table 13.1:

(A) Spherically Symmetric Space. Suppose that the dimensionality of the
whole space is $N = 3$, that all eigenvalues of its metric are positive, and that it has
maximally symmetric two-dimensional subspaces with positive curvature. Then
there is one $v$-coordinate, which we can call $r$, and two $u$-coordinates, which we
can replace with angles $\theta$, $\phi$ defined by

$$u^1 = \sin\theta \cos\phi \qquad u^2 = \sin\theta \sin\phi \tag{13.5.29}$$

Eq. (13.5.27), with $k = 1$, then gives

$$ds^2 = g(r)\, dr^2 + f(r) \left\{ d\theta^2 + \sin^2\theta\, d\phi^2 \right\} \tag{13.5.30}$$

with $f(r)$ and $g(r)$ positive functions of $r$.

(B) Spherically Symmetric Space-Time. Suppose that the dimensionality of
the whole space-time is $N = 4$, that three of the eigenvalues of its metric are
positive and one is negative, and that it has maximally symmetric two-dimensional
subspaces whose metric has positive eigenvalues and positive curvature. Then
there are two $v$-coordinates, which we can call $r$ and $t$, and two $u$-coordinates,
which can be replaced with $\theta$ and $\phi$ as in (13.5.29). Eq. (13.5.27), with $k = 1$,
then gives

$$-d\tau^2 = g_{tt}(r, t)\, dt^2 + 2 g_{rt}(r, t)\, dr\, dt + g_{rr}(r, t)\, dr^2 + f(r, t) \left\{ d\theta^2 + \sin^2\theta\, d\phi^2 \right\} \tag{13.5.31}$$

where $f(r, t)$ is a positive function and $g_{ab}(r, t)$ is a $2 \times 2$ matrix with one positive
and one negative eigenvalue.

(C) Spherically Symmetric Homogeneous Space-Time. Suppose that the
dimensionality of the whole space-time is $N = 4$, that three of the eigenvalues of its
metric are positive and one is negative, and that it has maximally symmetric
three-dimensional subspaces whose metric has positive eigenvalues and arbitrary
curvature. Then there is one $v$-coordinate and three $u$-coordinates, and (13.5.27) gives

$$-d\tau^2 = g(v)\, dv^2 + f(v) \left\{ d\mathbf{u}^2 + \frac{k\, (\mathbf{u} \cdot d\mathbf{u})^2}{1 - k \mathbf{u}^2} \right\}$$

where $f(v)$ is a positive function and $g(v)$ is a negative function of $v$. It is very
convenient to define new coordinates $t$, $r$, $\theta$, $\phi$ by

$$\int \left(-g(v)\right)^{1/2} dv \equiv t$$

$$u^1 = r \sin\theta \cos\phi \qquad u^2 = r \sin\theta \sin\phi \qquad u^3 = r \cos\theta$$

We then have

$$d\tau^2 = dt^2 - R^2(t) \left\{ \frac{dr^2}{1 - k r^2} + r^2\, d\theta^2 + r^2 \sin^2\theta\, d\phi^2 \right\} \tag{13.5.32}$$

where $R(t) = \sqrt{f(v)}$.
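
The change of variables behind (13.5.32) can be spot-checked numerically. The sketch below is only an illustration: the function names and the sample point are assumptions made here, not part of the text. It maps a displacement $(dr, d\theta, d\phi)$ to $d\mathbf{u}$ for $\mathbf{u} = r(\sin\theta\cos\phi, \sin\theta\sin\phi, \cos\theta)$ and compares the spatial bracket of (13.5.27) with the bracket of (13.5.32).

```python
import math

def spatial_form_u(r, th, ph, dr, dth, dph, k):
    """du^2 + k (u.du)^2 / (1 - k u^2), with u = r * (unit radial vector)."""
    st, ct = math.sin(th), math.cos(th)
    sp, cp = math.sin(ph), math.cos(ph)
    u = (r * st * cp, r * st * sp, r * ct)
    # du = (du/dr) dr + (du/dtheta) dtheta + (du/dphi) dphi
    du = (
        st * cp * dr + r * ct * cp * dth - r * st * sp * dph,
        st * sp * dr + r * ct * sp * dth + r * st * cp * dph,
        ct * dr - r * st * dth,
    )
    u2 = sum(c * c for c in u)
    du2 = sum(c * c for c in du)
    udu = sum(a * b for a, b in zip(u, du))
    return du2 + k * udu ** 2 / (1.0 - k * u2)

def spatial_form_polar(r, th, ph, dr, dth, dph, k):
    """dr^2 / (1 - k r^2) + r^2 dtheta^2 + r^2 sin^2(theta) dphi^2."""
    return (dr ** 2 / (1.0 - k * r ** 2)
            + r ** 2 * dth ** 2
            + (r * math.sin(th) * dph) ** 2)

# Arbitrary sample point and displacement (illustrative values only).
point = (0.5, 0.7, 1.1, 0.1, -0.2, 0.3)
for k in (1, 0, -1):
    print(k, spatial_form_u(*point, k), spatial_form_polar(*point, k))
```

The two quadratic forms agree for each $k$ because $\mathbf{u} \cdot d\mathbf{u} = r\, dr$ and $d\mathbf{u}^2 = dr^2 + r^2 d\theta^2 + r^2 \sin^2\theta\, d\phi^2$.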

The first two examples show how it is possible to capture the essence of 
spherical symmetry by giving a qualitative description of a space or space-time 
in terms of dimensionalities, signs of eigenvalues and curvatures, and the maximal 
symmetry of its subspaces. The metrics (13.5.30) and (13.5.31) are just what we 
would have expected on more elementary grounds; indeed, (13.5.31) was our 
starting point in Section 11.7. 

On the other hand, our third example leads to a result that could not easily 
have been anticipated. It is true that Eq. (13.5.32) was already derived in Section 
11.9 as the metric inside a spherically symmetric collapsing star of uniform density 
and zero pressure. The beautiful new thing we have learned in this chapter is that 
this metric can be derived solely from the assumption of homogeneity and 
isotropy, with no use of the Einstein field equations. 
PART FIVE
COSMOLOGY 




“We wonder, Oh, we wonder,
what on earth the world
may be?”

W. S. Gilbert, The Mikado


14 COSMOGRAPHY 


Modern science began with the discovery that the earth is not at the center of 
the universe. Antianthropocentrism has become incorporated into the scientific 
mentality, and no one now would seriously suggest that the earth, or the solar 
system, or our galaxy, or our local group of galaxies, occupies any specially 
favored position in the cosmos. Rather, our intuition now runs in precisely the 
opposite direction. A large portion of modern cosmological theory is built on the 
Cosmological Principle , the hypothesis that all positions in the universe are 
essentially equivalent. Of course, the homogeneity of the universe has to be
understood in the same sense as the homogeneity of a gas: it does not apply to the
universe in detail, but only to a “smeared-out” universe averaged over cells of
diameter $10^8$ to $10^9$ light years, which are large enough to include many clusters of
galaxies. Also, it appears that the universe is spherically symmetric about us, so
included in the Cosmological Principle is the assumption that the “smeared” 
universe is isotropic about every point. 

The question still remains, whether the universe is spherically symmetric and 
homogeneous at all times, or merely over some temporary present phase of its 
history. There has been an interesting suggestion, to be discussed in Section 15.11,
that the universe may have been highly anisotropic during some dense early phase, 
but that the anisotropies have since been largely smoothed out through the action 
of neutrino viscosity and other dissipative effects. However, even in such theories, 
the universe has been highly isotropic and homogeneous over all of that part of its 
history which is directly accessible to astronomical observation. 
This chapter will outline and apply a mathematical framework for the description
of the universe, based entirely on the Cosmological Principle, and on those
parts of general relativity that follow directly from the Principle of Equivalence. 
(This includes Chapters 2 to 6 and Chapter 13.) I shall first show that the Cos- 
mological Principle allows the specification of the cosmic metric entirely in terms 
of a “radius” R(t) and a trichotomic constant k [as in Eq. (13.5.32)] and we shall 
then see how astronomical observations can be interpreted as measurements of 
R(t) and k. 

This inherently kinematic approach, pioneered in the 1930’s by H. P. 
Robertson 1 and A. G. Walker, 2 is incomplete, in that it does not provide an 
a priori prediction of the function R(t). To calculate R(t) we need to make some
assumption about the material content of the universe, and then derive the 
Robertson-Walker metric as a solution of the Einstein field equations, as first done 
by Alexandre Friedmann 3 in 1922. Our discussion of the contents of the universe, 
and the use of the Einstein field equations, will be postponed until the next chapter, 
on cosmology. 

Why make this distinction between cosmography and cosmology? The reason
is simply that we do not know the equation of state of the matter and radiation of 
the universe throughout its history, and even if we did, we could not be sure that 
the Einstein equations really hold over cosmic times and distances. A modification 
of the field equations or the equation of state, such as the introduction of a Brans- 
Dicke field, a cosmological constant, or a large population of neutrinos or gravitons, 
would affect the function R(t) and invalidate the simplest Friedmann solution, but 
it would not require us to make any change in the descriptive framework assembled 
in this chapter. 

There remains the possibility that the universe is not homogeneous and isotropic 
after all. It might be homogeneous but not isotropic, as in the model of K. Godel. 4 
However, the cosmic microwave radiation discussed in Chapter 15 appears to be 
highly isotropic. (The universe cannot be isotropic about every point without also 
being homogeneous, as shown in the last chapter.) A more radical notion is that 
there is no “smeared” universe at all, but only clusters of galaxies, and clusters of 
clusters, and clusters of clusters of clusters, and so on, as in the hierarchical model 
proposed in 1908 by C. V. I. Charlier. 5 Empirical arguments for such super- 
clustering have been offered by G. de Vaucouleurs, 6 but the work of F. Zwicky, 7 
G. O. Abell, 8 and J. H. Oort 9 indicates that the hierarchy stops at clusters of 
galaxies or at most clusters of clusters of galaxies, and shows no evidence of 
inhomogeneities of larger scale. 

The real reason, though, for our adherence here to the Cosmological Principle 
is not that it is surely correct, but rather, that it allows us to make use of the 
extremely limited data provided to cosmology by observational astronomy. If we 
make any weaker assumptions, as in the anisotropic or hierarchical models, then 
the metric would contain so many undetermined functions (whether or not we use 
the field equations) that the data would be hopelessly inadequate to determine 
the metric. On the other hand, by adopting the rather restrictive mathematical 
framework described in this chapter, we have a real chance of confronting theory
with observation. If the data will not fit into this framework, we shall be able to 
conclude that either the Cosmological Principle or the Principle of Equivalence is 
wrong. Nothing could be more interesting. 


1 The Cosmological Principle 

The Cosmological Principle is the hypothesis that the universe is spatially 
homogeneous and isotropic. Before applying this principle, we shall have to 
formulate our intuitive ideas of homogeneity and isotropy in precise mathematical 
terms. 

First, let us fix our attention on one particular space-time coordinate system,
of the sort that might be used by terrestrial cosmographers. Spatial coordinates $x^i$
can be constructed with origin $x^i = 0$ at the center of the Milky Way, with
coordinate directions fixed by the line of sight from the Milky Way to some
typical distant galaxies, and with a scale of distances defined by the apparent
luminosities of distant galaxies, or other suitable objects, as seen from the Milky
Way. To define a time coordinate, it is convenient to use the evolving universe itself
as a clock. It is believed that several cosmic scalar fields, such as the proper energy
density $\rho$, or the black-body radiation temperature $T_\gamma$ (see Chapter 15) are
everywhere decreasing monotonically; choose any one of these, say a scalar $S$, and
let the time of any event be any definite decreasing function $t(S)$ of the chosen
scalar, when and where the event occurs. (We shall have to reopen the question of
how to define the time when we consider a steady state universe, in Section 14.8.)
The coordinates $\mathbf{x}$, $t$ so defined will be called the cosmic standard coordinate system.

The Cosmological Principle can be formulated as a statement about the
existence of equivalent coordinate systems. Suppose that we use the cosmic
standard coordinate system to carry out astronomical observations, determining
(never mind how!) the metric tensor $g_{\mu\nu}$, the energy-momentum tensor $T_{\mu\nu}$, and all
other cosmic fields, as functions of the cosmic standard coordinates $x^\mu$. A different
set of space-time coordinates $x'^\mu$ may be considered equivalent to the cosmic
standard coordinates, if the whole history of the universe appears the same in the
$x'^\mu$ coordinate system as in the cosmic standard coordinate system. This requires
that every cosmic field $g'_{\mu\nu}(x')$, $T'_{\mu\nu}(x')$, and so on, must be the same function of the
$x'^\mu$ as the corresponding quantities $g_{\mu\nu}(x)$, $T_{\mu\nu}(x)$, and so on, are of the standard
coordinates $x^\mu$. That is, at any coordinate point $y^\mu$ we must have

$$g_{\mu\nu}(y) = g'_{\mu\nu}(y) \tag{14.1.1}$$

$$T_{\mu\nu}(y) = T'_{\mu\nu}(y) \quad \text{etc.} \tag{14.1.2}$$

In the language of the last chapter, Eq. (14.1.1) says that the coordinate
transformation $x \to x'$ must be an isometry, and (14.1.2) says that $T_{\mu\nu}$, and so on, must
be form-invariant under this transformation.
In particular, Eq. (14.1.2) will have to hold for the scalar $S$ used to define our
cosmic standard time $t$. Since $S$ is by definition a function only of $t$, and a scalar,
Eq. (14.1.2) for $S$ reads, at $y = x'$,

$$S(t') = S'(x') = S(x) = S(t)$$

and so

$$t' = t \tag{14.1.3}$$

All coordinate systems that are equivalent to the cosmic standard system 
necessarily use cosmic standard time. 

The assumption of spatial isotropy can now be formulated as the requirement
that there exists a family of coordinate systems $x'^\mu(x; \theta)$, depending on three
independent parameters $\theta^1$, $\theta^2$, $\theta^3$, which are equivalent to the cosmic standard
coordinates, and which have the same origin, that is,

$$x'^i(0, t; \theta) = 0 \tag{14.1.4}$$

We can intuitively think of the three parameters $\theta^n$ as Euler angles that specify
the orientation of the $x'^i$ coordinate axes relative to the $x^i$ coordinate axes, but it
is unnecessary to be so specific; the important thing is that there be three
independent parameters. (In formulating this assumption, we have tacitly assumed
that the privileged Lorentz frame in which the universe appears isotropic happens
to coincide more or less with our own galaxy.)

It is a little trickier to formulate the assumption of homogeneity. Clearly,
homogeneity does not mean that any object can be chosen as the origin of a
coordinate system equivalent to our cosmic standard coordinates; after all, the
universe looks different to an observer moving away from the Milky Way at half
the speed of light than it does to us! The most we can expect is that every point $x^\mu$
in space-time is on some “fundamental trajectory” $x^i = X^i(t)$, which can serve
as the origin of a coordinate system $x'^\mu$ equivalent to the cosmic standard system.
(This is closely related to a postulate called Weyl’s principle, used in some
formulations of cosmology.) The Milky Way appears to be a rather ordinary galaxy, more
or less at rest with respect to its nearest neighbors, so we can expect that the
fundamental trajectories $\mathbf{X}(t)$ are pretty well defined by the motions of typical
members of the cosmic gas of galaxies, but this is by no means an essential part of
the assumption of homogeneity. The important point is that, since the $\mathbf{X}(t)$ at
any time $t$ fill up all space, they are determined by three independent parameters
$a^i$, which can be taken, for instance, as the values $a^i = X^i(T)$ of $X^i(t)$ at some
particular time $t = T$. Thus homogeneity means that there is a three-parameter
set of coordinates $x'^\mu(x; a)$, which are equivalent to the cosmic standard
coordinates and which have origin on the trajectory $x^i = X^i(t; a)$, that is,

$$x'^i(\mathbf{X}(t; a), t; a) = 0 \tag{14.1.5}$$

To be more precise, the $\mathbf{X}(t; a)$ are the trajectories of the privileged observers to
whom the universe appears isotropic.
Putting this together, we see that the Cosmological Principle entails the
existence of two independent three-parameter families of coordinate
transformations $x \to x'$, $x \to x'$, which are isometries in the sense of Eq. (14.1.1), and which,
according to Eq. (14.1.3), leave the time coordinate invariant. The universe
therefore satisfies the requirements assumed in Section 13.5 for a four-dimensional
space with three-dimensional maximally symmetric subspaces $t = \text{const}$.

(To see this in detail, we can descend to the case of infinitesimal
transformations, letting $\theta^i$ and $a^i$ approach zero. There are then six “Killing vectors” $\xi^\mu_j(x)$
and $\eta^\mu_j(x)$, defined by

$$\xi^\mu_j(x) \equiv \left[\frac{\partial x'^\mu(x; \theta)}{\partial \theta^j}\right]_{\theta = 0} \tag{14.1.6}$$

$$\eta^\mu_j(x) \equiv \left[\frac{\partial x'^\mu(x; a)}{\partial a^j}\right]_{a = 0} \tag{14.1.7}$$

It is only necessary to show that these six vectors are independent. Suppose that
they satisfy a linear relation

$$\sum_j c^j(t)\, \xi^i_j(x) + \sum_j d^j(t)\, \eta^i_j(x) = 0 \tag{14.1.8}$$

At the origin, Eqs. (14.1.4) and (14.1.5) give

$$\xi^i_j(0, t) = 0 \tag{14.1.9}$$

$$\eta^i_j(0, t) = -\left[\frac{\partial X^i(t; a)}{\partial a^j}\right]_{a = 0} \tag{14.1.10}$$

so at $x^i = 0$, Eq. (14.1.8) gives

$$\sum_j d^j(t) \left[\frac{\partial X^i(t; a)}{\partial a^j}\right]_{a = 0} = 0$$

Since the $a^i$ are independent parameters, this requires that

$$d^j(t) = 0 \tag{14.1.11}$$

Going back to (14.1.8) and (14.1.6), we then have

$$\sum_j c^j(t) \left[\frac{\partial x'^i(x; \theta)}{\partial \theta^j}\right]_{\theta = 0} = 0$$

and since the $\theta^i$ are independent parameters, this requires that

$$c^j(t) = 0 \tag{14.1.12}$$

Thus there are six independent Killing vectors with $\xi^t = 0$, the maximum
number possible (see Section 13.1) in three dimensions.)
In summary, the Cosmological Principle can be formulated in the language of
Chapter 13, as follows:

(i) The hypersurfaces with constant cosmic standard time are maximally
symmetric subspaces of the whole of space-time.

(ii) Not only the metric $g_{\mu\nu}$, but all cosmic tensors such as $T_{\mu\nu}$, are
form-invariant with respect to the isometries of these subspaces.


2 The Robertson-Walker Metric

The formulation of the Cosmological Principle given in the last section allows
us to apply the results of Section 13.5 for spaces with maximally symmetric
subspaces. We see immediately that it must be possible to choose coordinates
$r$, $\theta$, $\phi$, $t$, for which the metric takes the form given in Eq. (13.5.32):

$$d\tau^2 = dt^2 - R^2(t) \left\{ \frac{dr^2}{1 - k r^2} + r^2\, d\theta^2 + r^2 \sin^2\theta\, d\phi^2 \right\} \tag{14.2.1}$$

where $R(t)$ is an unknown function of time, and $k$ is a constant, which by a suitable
choice of units for $r$ can be chosen to have the value $+1$, $0$, or $-1$. (These are not
necessarily the same as the cosmic standard coordinates introduced in the last
section, although $t$ in Eq. (14.2.1) is the cosmic standard time, or a function of it.)
The metric (14.2.1) is known in cosmology as the Robertson-Walker metric.

It is interesting to consider the geometrical properties of the three-dimensional
spaces of constant $t$. These have metric

$${}^3g_{rr} = \frac{R^2(t)}{1 - k r^2} \qquad {}^3g_{\theta\theta} = r^2 R^2(t) \qquad {}^3g_{\phi\phi} = r^2 \sin^2\theta\, R^2(t) \tag{14.2.2}$$

with ${}^3g_{\mu\nu}$ vanishing for $\mu \neq \nu$. Comparing with (13.3.23)-(13.3.25) shows that the
three-dimensional curvature scalar is

$${}^3K(t) = k R^{-2}(t) \tag{14.2.3}$$

For $k = -1$ or $k = 0$ the space is infinite, while for $k = +1$ it is finite (though
unbounded), with proper circumference given by Eq. (13.3.33) as

$${}^3L = 2\pi R(t) \tag{14.2.4}$$

and proper volume given by Eq. (13.3.29) as

$${}^3V = 2\pi^2 R^3(t) \tag{14.2.5}$$

For $k = +1$ the spatial universe can be regarded as the surface of a sphere of
radius $R(t)$ in four-dimensional Euclidean space (see Section 13.3), and $R(t)$ can
justly be called the “radius of the universe.” For $k = -1$ and $k = 0$ no such
interpretation is possible, but $R(t)$ still sets the scale of the geometry of space, so
$R(t)$ will in all cases be called the cosmic scale factor.
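
The proper volume (14.2.5) can be cross-checked by direct quadrature. The following is a minimal sketch under one assumption that is not made in the text at this point: the substitution $r = \sin\chi$ with $0 \le \chi \le \pi$, used so that the whole closed space is covered (the range $0 \le r \le 1$ covers only half of it). With that substitution the metric (14.2.2) gives the volume element $R^3 \sin^2\chi \sin\theta\, d\chi\, d\theta\, d\phi$. The function name is hypothetical.

```python
import math

def closed_space_volume(R, n=20000):
    """Midpoint-rule estimate of the proper volume of the k = +1 space.

    Assumes r = sin(chi), 0 <= chi <= pi; the angular integral over
    theta and phi contributes the factor 4*pi.
    """
    h = math.pi / n
    radial = sum(math.sin((i + 0.5) * h) ** 2 for i in range(n)) * h
    return 4.0 * math.pi * R ** 3 * radial

R = 2.0
print(closed_space_volume(R), 2.0 * math.pi ** 2 * R ** 3)
```

The two printed numbers should agree closely, since the radial integral of $\sin^2\chi$ over $[0, \pi]$ is $\pi/2$.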

The construction in Section 13.5 of the coordinates $r$, $\theta$, $\phi$, $t$ was carried out in
such a way that the coordinate transformations, which leave the four-dimensional
metric (14.2.1) form-invariant, are just the purely spatial transformations that
leave (14.2.2) form-invariant. These include the rigid rotations

$$x'^i = R^i{}_j\, x^j \qquad (i, j = 1, 2, 3) \tag{14.2.6}$$

where $R$ is an arbitrary orthogonal matrix (and as usual $x^1 = r \sin\theta \cos\phi$,
$x^2 = r \sin\theta \sin\phi$, $x^3 = r \cos\theta$), together with “quasitranslations,” given by
setting $K C$ equal to $k$ times the unit matrix in Eq. (13.3.17):

$$\mathbf{x}' = \mathbf{x} + \mathbf{a} \left\{ (1 - k \mathbf{x}^2)^{1/2} - \left[1 - (1 - k \mathbf{a}^2)^{1/2}\right] \left(\frac{\mathbf{x} \cdot \mathbf{a}}{\mathbf{a}^2}\right) \right\} \tag{14.2.7}$$

where $\mathbf{a}$ is an arbitrary three-vector.
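
As a numerical illustration, one can check that the quasitranslation (14.2.7) leaves the spatial metric form-invariant. The sketch below assumes the spatial metric in the quasi-Cartesian coordinates has the form $g_{ij} = \delta_{ij} + k x_i x_j / (1 - k \mathbf{x}^2)$, as in the bracket of (13.5.27) with the factor $R^2$ dropped; all function names, the finite-difference step, and the sample vectors are illustrative choices, and $\mathbf{a}$ must be nonzero.

```python
import math

def quasitranslate(x, a, k):
    """Eq. (14.2.7); requires a != 0 and 1 - k x^2 > 0, 1 - k a^2 > 0."""
    x2 = sum(c * c for c in x)
    a2 = sum(c * c for c in a)
    xa = sum(p * q for p, q in zip(x, a))
    s = math.sqrt(1.0 - k * x2) - (1.0 - math.sqrt(1.0 - k * a2)) * xa / a2
    return [xi + ai * s for xi, ai in zip(x, a)]

def metric(x, k):
    """Assumed spatial metric g_ij = delta_ij + k x_i x_j / (1 - k x^2)."""
    x2 = sum(c * c for c in x)
    return [[(1.0 if i == j else 0.0) + k * x[i] * x[j] / (1.0 - k * x2)
             for j in range(3)] for i in range(3)]

def jacobian(x, a, k, h=1e-6):
    """Central-difference Jacobian J[i][j] = d x'_i / d x_j."""
    J = [[0.0] * 3 for _ in range(3)]
    for j in range(3):
        xp, xm = list(x), list(x)
        xp[j] += h
        xm[j] -= h
        fp, fm = quasitranslate(xp, a, k), quasitranslate(xm, a, k)
        for i in range(3):
            J[i][j] = (fp[i] - fm[i]) / (2.0 * h)
    return J

def isometry_defect(x, a, k):
    """Largest entry of |J^T g(x') J - g(x)|; near zero for an isometry."""
    J = jacobian(x, a, k)
    g_new = metric(quasitranslate(x, a, k), k)
    g_old = metric(x, k)
    worst = 0.0
    for i in range(3):
        for j in range(3):
            pulled = sum(J[m][i] * g_new[m][n] * J[n][j]
                         for m in range(3) for n in range(3))
            worst = max(worst, abs(pulled - g_old[i][j]))
    return worst

x = [0.1, 0.2, -0.1]
a = [0.2, -0.1, 0.3]
for k in (1, 0, -1):
    print(k, isometry_defect(x, a, k))
```

The defect is limited only by finite-difference error. Applying `quasitranslate` to the origin returns `a` itself, which is the statement that the transformation carries the origin into the point $\mathbf{a}$.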

The transformation (14.2.7) carries the origin into the point $\mathbf{a}$, so we can
conclude that any fixed point can serve as the origin of a system of coordinates
equivalent to the coordinate system of Eq. (14.2.1). That is, the “fundamental
trajectories” of observers, to whom the universe looks the same as it does to us,
are just $\mathbf{X}(t; \mathbf{a}) = \mathbf{a}$. We have already noted in the last section that the
fundamental trajectories ought to be close to the paths of “typical” galaxies, so we can
tentatively conclude that the spatial coordinates $r$, $\theta$, $\phi$ form a comoving system,
in the sense that typical galaxies have constant spatial coordinates $r$, $\theta$, $\phi$. One can
imagine the comoving coordinate mesh to be like lines painted on the surface of a 
balloon, on which dots represent typical galaxies. As the balloon is inflated or 
deflated the dots will move, but the lines will move with them, so each dot will 
keep the same coordinates. 

It is important to note that the fundamental trajectories $\mathbf{x} = \text{const}$ are
geodesics, because Eq. (14.2.1) gives

$$\Gamma^\mu_{tt} = 0 \tag{14.2.8}$$

Thus the statement that a galaxy has constant $r$, $\theta$, $\phi$ is perfectly consistent with
the supposition that galaxies are in free fall. Note also that the time coordinate $t$ in
(14.2.1) is not only a possible “cosmic standard” time in the sense of the last
section; it is also the proper time told by a clock at rest in any typical freely
falling galaxy. The coordinates $\mathbf{x}$, $t$ are thus comoving in precisely the same sense
as the “Gaussian normal” coordinates introduced in Section 11.8.

We can obtain a deeper insight into the behavior of matter in a
Robertson-Walker universe by applying the Cosmological Principle to the tensors that describe
the average state of cosmic matter, such as the energy-momentum tensor $T^{\mu\nu}$ and
the current $J_G^\mu$ of galaxies. ($J_G^\mu$ is defined exactly like the electric current (5.2.13),
but the sum runs over galaxies instead of particles, and a factor 1 replaces $e_n$.)
All such tensors are required to be form-invariant [in the sense of Section 13.4 or
Eq. (14.1.2)] with respect to those coordinate transformations, such as (14.2.6)
and (14.2.7), which leave the metric (14.2.1) form-invariant. These “isometries”
are purely spatial, so they transform $J_G^t$ and $T^{tt}$ as three-scalars; $J_G^i$ and $T^{it}$ as
three-vectors; and $T^{ij}$ as a three-tensor. According to the theorems proved in
Section 13.4, this then requires that

$$J_G^t = n_G(t) \qquad J_G^i = 0 \tag{14.2.9}$$

and

$$T_{tt} = \rho(t) \qquad T_{it} = 0 \qquad T_{ij} = g_{ij}\, p(t) \tag{14.2.10}$$

where $n_G$, $\rho$, and $p$ are unknown quantities that may depend on $t$, but not on $r$, $\theta$,
or $\phi$. These results can be written more elegantly as

$$J_G^\mu = n_G U^\mu \tag{14.2.11}$$

$$T_{\mu\nu} = (\rho + p)\, U_\mu U_\nu + p\, g_{\mu\nu} \tag{14.2.12}$$

where $U$ is a “velocity four-vector”

$$U^t \equiv 1 \tag{14.2.13}$$

$$U^i \equiv 0 \tag{14.2.14}$$

Equation (14.2.14) shows that the contents of the universe are, on the average, at rest
in the coordinate system $r$, $\theta$, $\phi$, as expected. In addition, comparison of Eq. (14.2.12)
with Eq. (5.4.2) shows that the energy-momentum tensor of the universe necessarily
takes the same form as for a perfect fluid.

It will be useful to have on hand the differential equations for $n_G(t)$, $\rho(t)$, and
$p(t)$ that are provided by conservation principles. If galaxies are neither created
nor destroyed, then $J_G^\mu$ obeys the conservation equation (5.2.14):

$$0 = (J_G^\mu)_{;\mu} = g^{-1/2} \frac{\partial}{\partial x^\mu} \left(g^{1/2} J_G^\mu\right) = g^{-1/2} \frac{\partial}{\partial t} \left(g^{1/2} n_G\right) \tag{14.2.15}$$

The metric (14.2.1) has a determinant $-g$ given by

$$g = R^6(t)\, r^4 (1 - k r^2)^{-1} \sin^2\theta \tag{14.2.16}$$

and therefore the conservation of galaxies yields the relation

$$n_G(t)\, R^3(t) = \text{constant} \tag{14.2.17}$$

(Note that $n_G$ is the number density per unit proper volume, and therefore increases
or decreases according as the universe shrinks or expands, while $n_G R^3$ is the
number density per unit coordinate volume, and therefore remains constant in a
comoving coordinate system.) The energy-momentum tensor (14.2.12) obeys the
conservation equation (5.4.3):

$$0 = T^{\mu\nu}{}_{;\nu} = \frac{\partial p}{\partial x^\nu}\, g^{\mu\nu} + g^{-1/2} \frac{\partial}{\partial x^\nu} \left[g^{1/2} (\rho + p)\, U^\mu U^\nu\right] + \Gamma^\mu_{\nu\lambda} (\rho + p)\, U^\nu U^\lambda \tag{14.2.18}$$

Using Eqs. (14.2.8) and (14.2.14), we find that this equation is trivially satisfied
for $\mu = r, \theta, \phi$, while for $\mu = t$ it reads

$$R^3(t) \frac{dp(t)}{dt} = \frac{d}{dt} \left\{ R^3(t) \left[\rho(t) + p(t)\right] \right\} \tag{14.2.19}$$

For instance, if the pressure of cosmic matter is negligible then (14.2.19) gives a
result analogous to Eq. (14.2.17):

$$\rho(t)\, R^3(t) = \text{constant} \tag{14.2.20}$$
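
To see (14.2.19) at work beyond the pressureless case, one can try a simple equation of state. The sketch below assumes $p = w\rho$ with constant $w$, which is a hypothetical choice made only for illustration (the text treats just $p = 0$), and checks numerically that the trial solution $\rho \propto R^{-3(1+w)}$ satisfies (14.2.19). Because both sides of (14.2.19) are time derivatives, the check is done with $R$ as the independent variable, using $d/dt = \dot{R}\, d/dR$. Function names and the step size are assumptions.

```python
def rho_of_R(R, w, rho0=1.0):
    """Trial solution rho = rho0 * R**(-3(1+w)) for the assumed law p = w*rho."""
    return rho0 * R ** (-3.0 * (1.0 + w))

def conservation_residual(R, w, h=1e-5):
    """R^3 dp/dR - d/dR [R^3 (rho + p)], which should vanish by (14.2.19)."""
    def p(s):
        return w * rho_of_R(s, w)

    def bracket(s):
        return s ** 3 * (rho_of_R(s, w) + p(s))

    dp = (p(R + h) - p(R - h)) / (2.0 * h)
    dbracket = (bracket(R + h) - bracket(R - h)) / (2.0 * h)
    return R ** 3 * dp - dbracket

for w in (0.0, 1.0 / 3.0):
    print(w, conservation_residual(1.7, w))
```

For $w = 0$ the trial solution reduces to $\rho R^3 = \text{constant}$, which is (14.2.20).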

The very great convenience of a comoving coordinate system should not blind
us to the fact that the typical galaxies actually do move further apart or closer
together when $R(t)$ increases or decreases. To see this clearly, we need to consider
what we mean by the distance between galaxies. Imagine a chain of typical galaxies
lying close together on the line of sight between us and a distant galaxy at $r_1$, $\theta_1$,
$\phi_1$, and suppose that at the same cosmic time $t$, observers in each galaxy measured
the distance to the next galaxy, say by measuring the travel time for light signals.
(Note that this is not the same as measuring the time for a single light signal to go
from $r = 0$ to $r = r_1$.) Adding up all these subdistances gives the proper distance

$$d_{\text{prop}}(t) = \int_0^{r_1} \sqrt{g_{rr}}\, dr = R(t) \int_0^{r_1} \frac{dr}{\sqrt{1 - k r^2}} \tag{14.2.21}$$

Obviously, no one is going to organize this sort of cosmic conspiracy, so the proper
distance is not very relevant to observational cosmology. However, we shall see in
Section 14.4 that the more relevant measures of distance, based on apparent
luminosities and angular diameters, all approach the proper distance (14.2.21)
for $r_1 \ll 1$. Thus, in one sense or another, galaxies do move apart when $R(t)$
increases, or together when $R(t)$ decreases.
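
The integral in (14.2.21) has the closed forms $\sin^{-1} r_1$, $r_1$, and $\sinh^{-1} r_1$ for $k = +1, 0, -1$. A minimal sketch that evaluates the proper distance both ways is given below; the function names and sample values are illustrative assumptions.

```python
import math

def f_of_r(r1, k):
    """Closed form of the integral of dr / sqrt(1 - k r^2) from 0 to r1."""
    if k == 1:
        return math.asin(r1)
    if k == -1:
        return math.asinh(r1)
    return r1  # k == 0

def proper_distance(R, r1, k):
    """Eq. (14.2.21): d_prop = R(t) * integral_0^r1 dr / sqrt(1 - k r^2)."""
    return R * f_of_r(r1, k)

def proper_distance_numeric(R, r1, k, n=20000):
    """Midpoint-rule evaluation of the same integral, as a cross-check."""
    h = r1 / n
    total = sum(1.0 / math.sqrt(1.0 - k * ((i + 0.5) * h) ** 2)
                for i in range(n))
    return R * h * total

for k in (1, 0, -1):
    print(k, proper_distance(1.0, 0.5, k), proper_distance_numeric(1.0, 0.5, k))
```

For $r_1 \ll 1$ all three cases reduce to $d_{\text{prop}} \simeq R(t)\, r_1$, independent of $k$.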

Cosmological theory presents observational astronomy with the challenge of
measuring the function $R(t)$ and determining whether $k$ is $+1$, or $0$, or $-1$. This is
not all there is to cosmology, but it is a central problem that must be solved if we 
are to understand the universe. The balance of this chapter will describe how well 
this challenge has been met. 


3 The Red Shift 

Our most important information about the cosmic scale factor B(t) comes to us 
through the observation of shifts in frequency of light emitted by distant sources. 
To calculate such frequency shifts, we shall place ourselves at the origin r = 0 of 
coordinates (according to the Cosmological Principle, this is a mere convention) 
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and consider an electromagnetic wave traveling to us along the — r direction, with 
6 and (p fixed. The equation of motion of a given wave crest is then 

dr 2 

0 — dz 2 = dt 2 — R 2 (t) 

1 — hr 2 


Hence if the wave leaves a typical galaxy, located at r_1, θ_1, φ_1, at time t_1, then it
will reach us at a time t_0 given by

    \int_{t_1}^{t_0} \frac{dt}{R(t)} = f(r_1)    (14.3.1)

where

    f(r_1) \equiv \int_0^{r_1} \frac{dr}{\sqrt{1 - k r^2}}
           = \begin{cases} \sin^{-1} r_1 & k = +1 \\ r_1 & k = 0 \\ \sinh^{-1} r_1 & k = -1 \end{cases}    (14.3.2)
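As a check on (14.3.2), the following minimal Python sketch evaluates f(r_1) in closed form and compares it with a direct numerical integration of dr/(1 - k r^2)^{1/2}. The function names and the midpoint-rule integrator are illustrative choices made here, not part of the text; the last helper applies the same function to the proper distance (14.2.21).

```python
import math


def f(r1: float, k: int) -> float:
    """Closed form of f(r_1) in Eq. (14.3.2) for k = +1, 0, or -1."""
    if k == 1:
        return math.asin(r1)      # valid for 0 <= r1 <= 1
    if k == 0:
        return r1
    if k == -1:
        return math.asinh(r1)
    raise ValueError("k must be +1, 0, or -1")


def f_numeric(r1: float, k: int, steps: int = 20_000) -> float:
    """Midpoint-rule estimate of the integral of dr / sqrt(1 - k r^2) from 0 to r1."""
    h = r1 / steps
    return sum(h / math.sqrt(1.0 - k * ((i + 0.5) * h) ** 2) for i in range(steps))


def proper_distance(scale_factor: float, r1: float, k: int) -> float:
    """Proper distance of Eq. (14.2.21): d_prop(t) = R(t) f(r_1)."""
    return scale_factor * f(r1, k)


if __name__ == "__main__":
    for k in (1, 0, -1):
        print(k, f(0.5, k), f_numeric(0.5, k))
```

For a given r_1 the closed forms are ordered as sin^{-1} r_1 > r_1 > sinh^{-1} r_1, so the same coordinate distance corresponds to the largest proper distance when k = +1.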



We saw in the last section that typical galaxies will have constant coordinates
r_1, θ_1, φ_1, so f(r_1) is time-independent. Hence, if the next wave crest leaves r_1 at
time t_1 + δt_1, it will arrive here at a time t_0 + δt_0, which again is given by a
relation like (14.3.1)

    \int_{t_1 + \delta t_1}^{t_0 + \delta t_0} \frac{dt}{R(t)} = f(r_1)    (14.3.3)


Subtracting Eq. (14.3.1) from (14.3.3), and noting that R(t) changes very little
during the period 10^{-14} sec of a typical light signal, we find that

    \frac{\delta t_0}{R(t_0)} = \frac{\delta t_1}{R(t_1)}

The frequency ν_0 observed here is thus related to the frequency ν_1 when emitted
by

    \frac{\nu_0}{\nu_1} = \frac{\delta t_1}{\delta t_0} = \frac{R(t_1)}{R(t_0)}    (14.3.4)


This is conventionally expressed in terms of a red-shift parameter z, defined as the
fractional increase in wavelength

    z \equiv \frac{\lambda_0 - \lambda_1}{\lambda_1}    (14.3.5)

Since λ_0/λ_1 equals ν_1/ν_0, (14.3.4) gives

    z = \frac{R(t_0)}{R(t_1)} - 1    (14.3.6)


To avoid confusion, it should be kept in mind that ν_1 and λ_1 are the frequency
and wavelength of the light if observed near the place and time of emission, and
hence presumably take the values measured when the same atomic transition
occurs in terrestrial laboratories, while ν_0 and λ_0 are the frequency and wavelength
of the light observed after its long journey to us. If z > 0 then λ_0 > λ_1 and we
speak of a red shift; if z < 0 then λ_0 < λ_1, and we speak of a blue shift.
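The relations (14.3.4) to (14.3.6) can be sketched in a few lines of Python. The function names are illustrative, and the 500-unit laboratory wavelength is a hypothetical value chosen only to exercise the formulas.

```python
def redshift_from_scale_factors(R0: float, R1: float) -> float:
    """Eq. (14.3.6): z = R(t_0)/R(t_1) - 1."""
    return R0 / R1 - 1.0


def redshift_from_wavelengths(lambda_obs: float, lambda_emit: float) -> float:
    """Eq. (14.3.5): fractional increase in wavelength."""
    return (lambda_obs - lambda_emit) / lambda_emit


def observed_frequency(nu_emit: float, z: float) -> float:
    """Eq. (14.3.4) rewritten with z: nu_0 = nu_1 / (1 + z)."""
    return nu_emit / (1.0 + z)


if __name__ == "__main__":
    z = redshift_from_scale_factors(R0=1.25, R1=1.0)   # expansion gives a red shift
    lambda_emit = 500.0                                 # hypothetical laboratory wavelength
    lambda_obs = lambda_emit * (1.0 + z)
    print(z, redshift_from_wavelengths(lambda_obs, lambda_emit))
```

A scale factor that has grown between emission and reception gives z > 0, and one that has shrunk gives z < 0, in line with the red-shift and blue-shift terminology above.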

If the universe is expanding, then R(t_0) > R(t_1), and (14.3.6) gives a red shift,
while if the universe is contracting, then R(t_0) < R(t_1), and (14.3.6) gives a blue
shift. Such frequency shifts find a natural explanation in terms of the Doppler
effect discussed in Section 2.2. Equation (14.2.21) shows that a relatively close
galaxy will move away from or toward the Milky Way, with a radial velocity

    v_r \simeq \dot{R}(t_0) r_1    (14.3.7)

The frequency shift is given for r_1 → 0 and t_0 → t_1 by Eqs. (14.3.6) and (14.3.1) as

    z \simeq \frac{\dot{R}(t_0)}{R(t_0)} (t_0 - t_1) \simeq r_1 \dot{R}(t_0) \simeq v_r    (14.3.8)

in agreement with Eq. (2.2.2). However, the frequency of light is also affected by
the gravitational field of the universe, and it is neither useful nor strictly correct to
interpret the frequency shifts of light from very distant sources in terms of a
special-relativistic Doppler effect alone. [The reader should be warned though, that
astronomers conventionally report even large frequency shifts in terms of a
recessional velocity, a “red shift” of v km/sec meaning that z = v/(3 × 10^5).]
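The reporting convention in the bracketed remark is a simple rescaling of z, as the short Python sketch below illustrates. The helper names and sample red shifts are illustrative; the speed of light is rounded to 3 × 10^5 km/sec as in the text.

```python
C_KM_PER_SEC = 3.0e5   # rounded speed of light, as in the text


def conventional_velocity(z: float) -> float:
    """'Red shift of v km/sec' in the astronomers' convention: v = 3e5 * z."""
    return C_KM_PER_SEC * z


def redshift_from_conventional_velocity(v_km_per_sec: float) -> float:
    """Inverse of the convention: z = v / 3e5."""
    return v_km_per_sec / C_KM_PER_SEC


if __name__ == "__main__":
    for z in (0.006, -0.001, 2.0):
        print(z, conventional_velocity(z))
```

For z = 2 the quoted "velocity" is twice the speed of light, which shows why the number is a reporting convention rather than a literal special-relativistic speed.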

The first evidence for a systematic red shift of spectral lines from distant 
objects was provided by a program of observations carried out by Vesto Melvin 
Slipher with the Lowell Observatory 24-in. refractor from about 1910 to the mid- 
1920’s. In a 1922 summary, 10 he gave data for 41 spiral nebulae, of which 36 had 
absorption lines shifted to the red by amounts up to z ≃ 0.006, and only five
showed blue shifts, the largest being that of the Andromeda nebula, with z ≃
-0.001. From the beginning these frequency shifts were interpreted as due to the
Doppler effect, but at first it was expected that they could be accounted for by the
motion of the solar system, rather than the galaxies. The preponderance of red
shifts in all parts of the sky made this interpretation increasingly untenable, and
by 1918 Wirtz 11 suggested that in addition to the solar motion there was a general
recession of spiral nebulae (then called the “K-term”) away from us in all directions.
Of course, other explanations were possible, such as a gravitational red shift caused 
by very strong local gravitational fields. (Perhaps the triumph of general relativity 
in the 1919 eclipse expedition made this explanation particularly attractive.) 
However, in a series of papers 12 written in the 1920’s, Wirtz and K. Lundmark 
showed that Slipher’s red shifts increased with the distance of the spiral nebulae, 
and therefore could most easily be understood in terms of a general recession of 
distant galaxies, the furthest naturally being those moving fastest. The announce-
ment by Edwin Hubble 13 in 1929 of a “roughly linear relation between velocities
and distances” established in most astronomers’ minds the interpretation of the
red shift as a cosmological Doppler effect, and this interpretation has survived
through the decades until the present.

It is not possible to carry this discussion further, without first sharpening our 
understanding of how cosmological distances are defined, and how they relate to 
the coordinate distance r_1. The red shifts will be taken up again in Section 14.6.


4 Measures of Distance 


There are at present only two practical methods (not counting the measure- 
ment of red shifts) for determining the distance of an object outside our galaxy. 
If we know its absolute luminosity, we can compare it to the observed apparent 
luminosity; or if we know its true diameter, we can compare it to the observed 
angular diameter. In addition, the distance to a near enough object can be deter- 
mined by measuring its parallax, the shift in apparent position in the sky caused by 
the earth’s revolution about the sun, or its proper motion, the shift in apparent 
position in the sky caused by the object’s actual motion relative to the sun. 

The “distances” measured by these four methods are identical for objects that 
are nearer than about 10^9 light years, but beyond this range they differ from
each other, and also from the “proper distance” defined in Section 14.2. Thus, in 
order to use the correlation between red shifts and apparent luminosities or angular 
diameters to measure R(t) and k, it will first be necessary to express the distances,
determined from apparent luminosities or angular diameters, in terms of r_1 and
t_0. It will be instructive, if largely academic, to do the same for distances determined
from measurements of parallax or proper motion. 

In order to calculate parallaxes and apparent luminosities, we must know the 
paths of light rays that leave a source at r_1, θ_1, φ_1 and pass near r = 0. (See
Figure 14.1.) In a coordinate system x'^μ in which the light source is at the origin,
the ray path is given by the very simple equation:

    x'(\rho) = n \rho    (14.4.1)

where n is a fixed vector, ρ is a variable positive parameter describing positions
along the path (with ρ = 0 at the source), and x' is a three-vector formed from the
comoving coordinates r', θ', φ' in the usual way:

    x' = (r' \sin\theta' \cos\phi', \; r' \sin\theta' \sin\phi', \; r' \cos\theta')


The transformation between the x'^μ coordinates, and another coordinate system in
which the light source is at x_1, is given by setting a = x_1 and interchanging x and
x' in Eq. (14.2.7):

    x = x' + x_1 \left[ (1 - k x'^2)^{1/2} - \{ 1 - (1 - k x_1^2)^{1/2} \} \frac{x' \cdot x_1}{x_1^2} \right]    (14.4.2)

Figure 14.1 Quantities used in the calculation of parallaxes and 
apparent luminosities. The angles and the curvature of the light 
ray are greatly exaggerated. 


Here again, we use a vector notation, with

    x = (r \sin\theta \cos\phi, \; r \sin\theta \sin\phi, \; r \cos\theta)

and we define scalar products as in Euclidean geometry. There is no loss of
generality in taking n to be a unit vector, with n^2 = 1. The parametric equation of the
light paths, given by substituting (14.4.1) in (14.4.2), is then

    x(\rho) = n \rho + x_1 \left[ (1 - k \rho^2)^{1/2} - \{ 1 - (1 - k r_1^2)^{1/2} \} \frac{\rho \, (n \cdot x_1)}{r_1^2} \right]    (14.4.3)

where r_1 = (x_1^2)^{1/2}.

We shall now specify that the origin of the x^μ coordinate system is some
definite point in the solar system, such as the center of the sun or the center of the
200-in. mirror at Palomar, and we shall restrict ourselves to light paths that pass
close to this origin. In this case, the unit vector n must point nearly in the -x_1
direction, so

    n \simeq -\hat{x}_1 + \epsilon    (14.4.4)

where x̂_1 is the unit vector x_1/r_1, and ε is a very small vector perpendicular to x_1.
(Here and below, ≃ means that an equation is valid to first order in ε.) Recalling
Eq. (14.4.1), we note for future reference that |ε| is the angle between the light
path and the -x̂_1 direction, as measured in the coordinate system x'^μ which is
locally inertial at the light source. The light path, given by inserting (14.4.4) in
(14.4.3) and discarding terms of order ε^2, is

    x(\rho) \simeq -\hat{x}_1 \left[ \rho (1 - k r_1^2)^{1/2} - r_1 (1 - k \rho^2)^{1/2} \right] + \epsilon \rho    (14.4.5)

The light path comes closest to the origin at ρ ≃ r_1. The impact parameter b is the
proper distance of the path from the origin at this point, given by (14.2.1) and
(14.4.5) as

    b \simeq R(t_0) |x(r_1)| \simeq R(t_0) r_1 |\epsilon|    (14.4.6)


where t_0 is the time that the light ray arrives near the origin.

Measurements of astronomical parallaxes amount to measurements of the 
direction of light paths as a function of impact parameter, which in this case is the 
projection of the earth-sun separation on the plane normal to the line of sight. 
The light path has a direction near the origin given by 


    \frac{dx(\rho)}{d\rho} \simeq -(1 - k r_1^2)^{-1/2} \hat{x}_1 + \epsilon

so the line of sight is given by a unit vector in the opposite direction

    \hat{u} \simeq -(1 - k r_1^2)^{1/2} \frac{dx}{d\rho} = \hat{x}_1 - (1 - k r_1^2)^{1/2} \epsilon    (14.4.7)

Hence the angle between the actual line of sight, and the line of sight that would
be observed at the origin, is

    \theta \simeq |\hat{u} - \hat{x}_1| \simeq (1 - k r_1^2)^{1/2} |\epsilon| \simeq (1 - k r_1^2)^{1/2} \frac{b}{R(t_0) r_1}    (14.4.8)


In Euclidean geometry, a source at distance d would have a parallactic angle
θ ≃ b/d, so in general we may define the parallax distance d_P of a light source as

    d_P \equiv \frac{b}{\theta} \quad \text{for } b \to 0, \; \theta \to 0    (14.4.9)

and (14.4.8) may therefore be written

    d_P = \frac{R(t_0) r_1}{(1 - k r_1^2)^{1/2}}    (14.4.10)

In a universe with k = +1, objects at r_1 = 1 have infinite parallax distance and
further objects (with r_1 < 1) have decreasing parallax distances, as noted first
in 1900 by K. Schwarzschild. 14
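The behavior of (14.4.10) for the three values of k can be seen numerically with a short Python sketch. The function name and the sample values of r_1 are illustrative, and R(t_0) is set to 1 so that distances are in units of the present scale factor.

```python
import math


def parallax_distance(R0: float, r1: float, k: int) -> float:
    """Eq. (14.4.10): d_P = R(t_0) r_1 / sqrt(1 - k r_1^2)."""
    s = 1.0 - k * r1 * r1
    if s <= 0.0:
        return math.inf           # k = +1 and r_1 = 1: infinite parallax distance
    return R0 * r1 / math.sqrt(s)


if __name__ == "__main__":
    for r1 in (0.001, 0.5, 0.9, 0.99, 1.0):
        print(r1, [parallax_distance(1.0, r1, k) for k in (1, 0, -1)])
```

For r_1 much less than 1 the three cases agree with R(t_0) r_1, while for k = +1 the parallax distance grows without bound as r_1 approaches 1.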

In order to calculate apparent luminosities, consider a circular telescope mirror
of radius b, placed with its center at the origin and its normal along the line of sight
x̂_1 to the light source. The light rays that just graze the mirror edge form a cone at
the light source that, in the coordinate system x'^μ locally inertial at the source, has
a half-angle |ε| given by Eq. (14.4.6). The solid angle of this cone is

    \pi |\epsilon|^2 = \frac{\pi b^2}{R^2(t_0) r_1^2}

and the fraction of all isotropically emitted photons that reach the mirror is the
ratio of this solid angle to 4π, or

    \frac{|\epsilon|^2}{4} = \frac{A}{4\pi R^2(t_0) r_1^2}    (14.4.11)

where A is the proper area of the mirror

    A = \pi b^2


However, each photon emitted with energy hν_1 will be red-shifted to energy
hν_1 R(t_1)/R(t_0), and photons emitted at time intervals δt_1 will arrive at time
intervals δt_1 R(t_0)/R(t_1), where as always t_1 is the time the light leaves the source,
and t_0 is the time the light arrives at the mirror. Thus the total power P received
by the mirror is the total power emitted by the source, its absolute luminosity L,
times a factor R^2(t_1)/R^2(t_0), times the fraction (14.4.11):

    P = L \left( \frac{R^2(t_1)}{R^2(t_0)} \right) \left( \frac{A}{4\pi R^2(t_0) r_1^2} \right)

The apparent luminosity l is the power per unit mirror area, so

    l \equiv \frac{P}{A} = \frac{L R^2(t_1)}{4\pi R^4(t_0) r_1^2}    (14.4.12)


In a Euclidean space the apparent luminosity of a source at rest at distance d
would be L/4πd^2, so in general we may define the luminosity distance d_L of a light
source as

    d_L \equiv \left( \frac{L}{4\pi l} \right)^{1/2}    (14.4.13)

and (14.4.12) may therefore be written

    d_L = \frac{R^2(t_0) r_1}{R(t_1)}    (14.4.14)

(This calculation could also have been carried out without using quantum theory,
by applying the energy-conservation equation (T^{μν})_{;ν} = 0 to the radiation emitted
by the source. 15)
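The chain from (14.4.12) to (14.4.14) can be checked numerically. In the Python sketch below, the function names and the sample values of L, R(t_0), R(t_1), and r_1 are illustrative assumptions; only the formulas come from the text.

```python
import math


def apparent_luminosity(L: float, R0: float, R1: float, r1: float) -> float:
    """Eq. (14.4.12): l = L R^2(t_1) / (4 pi R^4(t_0) r_1^2)."""
    return L * R1**2 / (4.0 * math.pi * R0**4 * r1**2)


def luminosity_distance(R0: float, R1: float, r1: float) -> float:
    """Eq. (14.4.14): d_L = R^2(t_0) r_1 / R(t_1)."""
    return R0**2 * r1 / R1


def luminosity_distance_from_flux(L: float, l: float) -> float:
    """Eq. (14.4.13): d_L = (L / 4 pi l)^(1/2)."""
    return math.sqrt(L / (4.0 * math.pi * l))


if __name__ == "__main__":
    L, R0, R1, r1 = 1.0e40, 1.5, 1.0, 0.2          # arbitrary sample values
    l = apparent_luminosity(L, R0, R1, r1)
    print(luminosity_distance(R0, R1, r1), luminosity_distance_from_flux(L, l))
```

Both routes give the same d_L, which by (14.3.6) can also be written as R(t_0) r_1 (1 + z).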

Next, let us calculate the angular diameter, observed at r = 0, t = t_0, of a
light source of true proper diameter D at r = r_1, t = t_1. The light rays from the
edge of the source travel to the origin along fixed directions x/r. Without loss of
generality, we can rotate the coordinate system so that the center of the light
source is at θ = 0, and suppose that light from its edges travels to the origin on a
cone with half-angle θ = δ/2. (See Figure 14.2.) The proper distance across the
source is then given by Eq. (14.2.1) as

    D = R(t_1) r_1 \delta

so the angular diameter of the source is

    \delta = \frac{D}{R(t_1) r_1}    (14.4.15)



Figure 14.2 Quantities used in the calculation of angular 
diameters and proper motions. The angle <5 is greatly 
exaggerated. 


In Euclidean geometry, the angular diameter of a source of diameter D at a
distance d is δ = D/d, so in general we may define the angular diameter distance d_A
of a light source as

    d_A \equiv \frac{D}{\delta}    (14.4.16)

and (14.4.15) may therefore be written

    d_A = R(t_1) r_1    (14.4.17)

Note that R(t_1) decreases as r_1 increases, so in some models d_A can have a maxi-
mum, with objects at very large distances having angular diameters that increase
with increasing luminosity distance.

Finally, let us consider the determination of distances from proper motions.
A source with a true velocity V_⊥ transverse to the line of sight will, in a time Δt_0,
move a proper distance

    \Delta D = V_\perp \Delta t_1 = V_\perp \Delta t_0 \frac{R(t_1)}{R(t_0)}

so, by the same reasoning that led to (14.4.15), the source will appear to move an
angular distance

    \Delta\theta = \frac{\Delta D}{R(t_1) r_1} = \frac{V_\perp \Delta t_0}{R(t_0) r_1}    (14.4.18)

In a Euclidean space the change in apparent position on the celestial sphere of a
source at distance d would be V_⊥ Δt_0/d, so we may define the proper-motion distance
of a light source as

    d_M \equiv \frac{V_\perp}{\mu}    (14.4.19)

where μ is the proper motion

    \mu \equiv \frac{\Delta\theta}{\Delta t_0}    (14.4.20)

Equation (14.4.18) may therefore be written

    d_M = R(t_0) r_1    (14.4.21)


Of course, we can use (14.4.19) to measure the proper-motion distance only if we 
have some a priori knowledge of the transverse velocity, a point to which we return 
in the next section. 

The luminosity distance d_L, angular diameter distance d_A, and proper-motion
distance d_M of a light source with red shift z are related by the simple formulas

    \frac{d_A}{d_L} = \frac{R^2(t_1)}{R^2(t_0)} = (1 + z)^{-2}    (14.4.22)

    \frac{d_M}{d_L} = \frac{R(t_1)}{R(t_0)} = (1 + z)^{-1}    (14.4.23)
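Since (14.4.14), (14.4.17), and (14.4.21) all involve the combination R(t_0) r_1, the three distances follow from that one number and the red shift. The Python sketch below packages this; the `Distances` container and the sample numbers are illustrative assumptions, not notation from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Distances:
    d_L: float   # luminosity distance, Eq. (14.4.14)
    d_A: float   # angular diameter distance, Eq. (14.4.17)
    d_M: float   # proper-motion distance, Eq. (14.4.21)


def distances(R0_r1: float, z: float) -> Distances:
    """Distances for a source with red shift z, given the product R(t_0) r_1.

    Uses R(t_1) = R(t_0) / (1 + z) from Eq. (14.3.6).
    """
    return Distances(
        d_L=R0_r1 * (1.0 + z),
        d_A=R0_r1 / (1.0 + z),
        d_M=R0_r1,
    )


if __name__ == "__main__":
    d = distances(R0_r1=100.0, z=0.5)
    print(d, d.d_A / d.d_L, d.d_M / d.d_L)
```

The printed ratios reproduce (1 + z)^{-2} and (1 + z)^{-1}, so for a source of known z only one of the three distances carries independent information.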


If one can measure z accurately, there is no point in attempting a separate deter-
mination of d_L, d_A, and d_M, except perhaps as a check of the Robertson-Walker
metric or of the cosmological origin of red shifts. In contrast, the measurement of
the parallax distance d_P could in principle give information beyond what could be
learned from a measurement of d_L and z, but of course at present it is only possible
to measure parallaxes for very close objects, with z ≪ 1 and r_1 ≪ 1. In this case
all these observable distances become essentially equal to each other, and also to
the proper distance (14.2.21):

    d_A \simeq d_L \simeq d_M \simeq d_P \simeq d_{prop}(t_0) \simeq R(t_0) r_1    (14.4.24)


The distinction among different measures of distance only becomes important for 
objects that are billions of light years away. 

Actually, measurements of luminosity distance, angular diameter distance, 
and red shift are inextricably mixed, for at least two reasons: 


(A) Light sources such as galaxies have smooth luminosity distributions,
without sharp edges. Let L(D) be the absolute luminosity of that part of a light
source within a circle (in the plane transverse to the line of sight) of diameter D.
Then (14.4.12) and (14.4.15) give the apparent luminosity within an angular
diameter δ as

    l(\delta) = \frac{L(r_1 R(t_1) \delta) \, R^2(t_1)}{4\pi R^4(t_0) r_1^2}    (14.4.25)


It is more convenient to write this formula in terms of an absolute luminosity per 
unit transverse area 


    B(D) \equiv \frac{L'(D)}{2\pi D}    (14.4.26)

and an apparent luminosity per unit solid angle, or brightness:

    b(\delta) \equiv \frac{l'(\delta)}{2\pi \delta}    (14.4.27)

Using (14.4.26), (14.4.27), and (14.3.6) in (14.4.25) then gives the brightness as

    b(\delta) = \frac{B(r_1 R(t_1) \delta)}{4\pi (1 + z)^4}    (14.4.28)


The isophotal angular diameter is the angle δ_b at which the brightness (14.4.28)
falls below some fixed threshold value b:

    \delta_b = \frac{D_b}{r_1 R(t_1)}    (14.4.29)

where D_b is defined by the implicit equation

    B(D_b) = 4\pi b (1 + z)^4    (14.4.30)


For instance, Hubble has suggested that B(D) is well represented for most galaxies
by a function that near the galactic edge has the approximate form 16

    B(D) \simeq \frac{\alpha L}{D^2}    (14.4.31)

where α is a dimensionless constant, of order unity. Then (14.4.29)-(14.4.31) and
(14.4.12) give

    D_b \simeq \left( \frac{\alpha L}{4\pi b (1 + z)^4} \right)^{1/2}    (14.4.32)

    \delta_b \simeq \left( \frac{\alpha l}{b} \right)^{1/2}    (14.4.33)


In this particular case, the measurement of an isophotal angular diameter is just 
tantamount to a measurement of apparent luminosity. 
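That the isophotal diameter carries no information beyond the apparent luminosity in this case can be verified numerically. The Python sketch below computes δ_b in two ways, once from (14.4.32) with (14.4.29) and once from (14.4.33); the function names and sample values are illustrative assumptions.

```python
import math


def apparent_luminosity(L: float, z: float, r1: float, R1: float) -> float:
    """Eq. (14.4.12) with R(t_0) = R(t_1) (1 + z)."""
    R0 = R1 * (1.0 + z)
    return L * R1**2 / (4.0 * math.pi * R0**4 * r1**2)


def isophotal_diameter_direct(alpha, L, b, z, r1, R1):
    """delta_b from D_b of Eq. (14.4.32) and the definition (14.4.29)."""
    D_b = math.sqrt(alpha * L / (4.0 * math.pi * b * (1.0 + z) ** 4))
    return D_b / (r1 * R1)


def isophotal_diameter_from_flux(alpha, l, b):
    """Eq. (14.4.33): delta_b = (alpha l / b)^(1/2)."""
    return math.sqrt(alpha * l / b)


if __name__ == "__main__":
    alpha, L, b, z, r1, R1 = 1.0, 1.0e44, 1.0e-3, 0.3, 0.25, 1.0e27   # arbitrary values
    l = apparent_luminosity(L, z, r1, R1)
    print(isophotal_diameter_direct(alpha, L, b, z, r1, R1),
          isophotal_diameter_from_flux(alpha, l, b))
```

The two results coincide for any choice of z and r_1, which is the content of the statement that δ_b is determined by l alone when B(D) has the form (14.4.31).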

(B) Most detectors of radiation respond only to photons in a narrow range of 
wavelengths. Thus it is necessary to distinguish between the bolometric luminosities 
L or l discussed above, which take account of radiation emitted or received at all 
wavelengths, and the ultraviolet, blue, photographic, visual, and infrared luminosities,
which represent the average power or flux in various wavelength bands. If a source
emits a radiant power L(ν_1) at all frequencies less than ν_1, then Eqs. (14.4.12) and
(14.3.4) give an apparent luminosity for all frequencies less than ν_0 as

    l(\nu_0) = \frac{L(\nu_0 R(t_0)/R(t_1)) \, R^2(t_1)}{4\pi R^4(t_0) r_1^2}    (14.4.34)

The frequency distributions of received and emitted power are therefore related by
the formula:

    l'(\nu_0) = \frac{L'(\nu_0 R(t_0)/R(t_1)) \, R(t_1)}{4\pi R^3(t_0) r_1^2}    (14.4.35)


For a black body, L'(ν) is given by the Planck formula

    L'(\nu) = \frac{15 L}{\pi^4} \left( \frac{h}{k T_1} \right)^4 \nu^3 \left[ \exp\left( \frac{h\nu}{k T_1} \right) - 1 \right]^{-1}    (14.4.36)

where T_1 is the source temperature, k is the Boltzmann constant, and h is Planck’s
constant. The frequency distribution of received radiation is then

    l'(\nu_0) = \frac{15 l}{\pi^4} \left( \frac{h}{k T_0} \right)^4 \nu_0^3 \left[ \exp\left( \frac{h\nu_0}{k T_0} \right) - 1 \right]^{-1}    (14.4.37)

where l is the bolometric apparent luminosity (14.4.12), and T_0 is the red-shifted
temperature

    T_0 = T_1 \frac{R(t_1)}{R(t_0)}    (14.4.38)


If we know the temperatures T_1 or T_0, it is easy to convert the absolute luminosity
L'(ν_1) Δν_1 or the apparent luminosity l'(ν_0) Δν_0 in any narrow frequency band into a
bolometric absolute or apparent luminosity.
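Two properties of the black-body formulas can be checked numerically: the Planck distribution (14.4.36) integrates to the bolometric luminosity, and inserting it in (14.4.35) gives a black-body distribution at the red-shifted temperature (14.4.38). The Python sketch below does both. The function names, the rounded CGS values of h and k, the midpoint-rule integration, and the sample source parameters are assumptions made for illustration.

```python
import math

H_PLANCK = 6.626e-27      # erg sec (rounded)
K_BOLTZMANN = 1.381e-16   # erg / K (rounded)


def planck_distribution(nu, total, temperature):
    """Eqs. (14.4.36)/(14.4.37): power per unit frequency of a black body
    whose frequency-integrated power is `total`."""
    a = H_PLANCK / (K_BOLTZMANN * temperature)
    return 15.0 * total / math.pi**4 * a**4 * nu**3 / math.expm1(a * nu)


def integrate_over_frequency(total, temperature, steps=20_000):
    """Midpoint-rule integral of the distribution from 0 to 50 kT/h."""
    nu_max = 50.0 * K_BOLTZMANN * temperature / H_PLANCK
    dnu = nu_max / steps
    return dnu * sum(planck_distribution((i + 0.5) * dnu, total, temperature)
                     for i in range(steps))


def bolometric_apparent_luminosity(L, R0, R1, r1):
    """Eq. (14.4.12)."""
    return L * R1**2 / (4.0 * math.pi * R0**4 * r1**2)


def received_distribution(nu0, L, T1, R0, R1, r1):
    """Eq. (14.4.35) applied to a black-body source at temperature T1."""
    nu1 = nu0 * R0 / R1
    return planck_distribution(nu1, L, T1) * R1 / (4.0 * math.pi * R0**3 * r1**2)


if __name__ == "__main__":
    L, T1, R0, R1, r1 = 1.0e33, 6000.0, 1.5, 1.0, 0.3   # arbitrary sample source
    l = bolometric_apparent_luminosity(L, R0, R1, r1)
    T0 = T1 * R1 / R0                                    # Eq. (14.4.38)
    print(integrate_over_frequency(1.0, T1))
    print(received_distribution(5.0e14, L, T1, R0, R1, r1),
          planck_distribution(5.0e14, l, T0))
```

The second printed pair agrees to rounding error, so a red-shifted black body is again a black body, with its temperature lowered by the factor R(t_1)/R(t_0).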
A word must be said about the time-honored language used by astronomers to
describe astronomical distances and luminosities. The astronomical unit (abbrev-
iated a.u.) is the mean distance of the earth from the sun

    1 a.u. = 1.49598 × 10^8 km    (14.4.39)

Treating the earth’s orbit as a circle, the projection of the earth-sun separation
vector on the plane normal to the line of sight to any fixed star reaches a maximum
value b_max equal to 1 a.u. during the course of a year, so the position of a star
traces out an ellipse of maximum radius π given by Eq. (14.4.9) as

    \pi \text{ (in radians)} = \frac{1}{d_P \text{ (in a.u.)}}    (14.4.40)

We shall call π the trigonometric parallax. One parsec (abbreviated pc) is defined as
the distance d_P at which a star would have a trigonometric parallax of 1"; there
are 206,264.8 seconds in one radian, so

    1 pc = 206,264.8 a.u. = 3.0856 × 10^13 km = 3.2615 light years    (14.4.41)

Thus (14.4.40) may in general be expressed as

    \pi \text{ (in seconds)} = \frac{1}{d_P \text{ (in pc)}}    (14.4.42)

Only the nearest stars have measurable trigonometric parallaxes, but such is the 
power of tradition, that all astronomical distances outside our solar system are 
conventionally given in parsecs, and sometimes these distances, however measured, 
are even described in terms of an equivalent parallax. 
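The unit conversions in (14.4.39) to (14.4.42) can be reproduced with a few lines of Python. The constant and function names are illustrative, and the light year is computed here from the speed of light and a Julian year of 365.25 days, which is an assumption about the definition being used.

```python
import math

AU_KM = 1.49598e8                                  # Eq. (14.4.39)
ARCSEC_PER_RADIAN = 180.0 * 3600.0 / math.pi       # about 206,264.8
PC_AU = ARCSEC_PER_RADIAN                          # 1 pc in a.u., by definition
PC_KM = PC_AU * AU_KM
LIGHT_YEAR_KM = 2.99792458e5 * 365.25 * 86400.0    # assumed: Julian year
PC_LIGHT_YEARS = PC_KM / LIGHT_YEAR_KM


def distance_pc_from_parallax(parallax_arcsec: float) -> float:
    """Eq. (14.4.42): d_P in parsecs is 1 / (parallax in seconds of arc)."""
    return 1.0 / parallax_arcsec


if __name__ == "__main__":
    print(PC_AU, PC_KM, PC_LIGHT_YEARS)
    print(distance_pc_from_parallax(0.1))
```

The computed values agree with (14.4.41) to the number of figures quoted there.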

The apparent bolometric luminosity l is usually expressed in terms of an
apparent bolometric magnitude m_bol, or simply m, which for historical reasons is
defined so that

    l = 10^{-2m/5} \times 2.52 \times 10^{-5} \ \text{erg/cm}^2\text{-sec}    (14.4.43)

The absolute bolometric magnitude M is defined as the apparent bolometric
magnitude the source would have at a distance 10 pc, so

    L = 10^{-2M/5} \times 3.02 \times 10^{35} \ \text{erg/sec}    (14.4.44)

Equation (14.4.13) may be expressed as a formula for the luminosity distance d_L
in terms of the distance modulus m - M:

    d_L = 10^{1 + (m - M)/5} \ \text{pc}    (14.4.45)

The apparent magnitudes m_U, m_B, and so on, in the ultraviolet, blue, photographic,
visual, and infrared wavelength bands are related to the corresponding apparent
luminosities by formulas like (14.4.43), but with different normalization constants,
chosen so that all apparent magnitudes will be the same for stars of the A0 spectral
type between fifth and sixth magnitude. The corresponding absolute magnitudes
are defined so that the distance moduli m_U - M_U, m_B - M_B, and so on, are all
equal to m - M. (Often the ultraviolet, blue, and visual apparent magnitudes
m_U, m_B, m_V are denoted U, B, and V.) The color index is the quantity m_B - m_V =
M_B - M_V; stars with negative color index are bluer than stars with positive color
index. For purposes of comparison, the sun has absolute magnitudes

    M (bolometric) = +4.72    M_U = 5.51    M_B = 5.41    M_V = 4.79

and apparent magnitudes

    m (bolometric) = -26.85    m_U = -26.06    m_B = -26.16    m_V = -26.78

so that its distance modulus is -31.57 and its color index is 0.62.
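The magnitude formulas (14.4.43) to (14.4.45) can be exercised on the solar values just quoted. In the Python sketch below the function names are illustrative; the normalization constants are those of the text, and the conversion factor 206,264.8 a.u. per parsec is taken from (14.4.41).

```python
import math

AU_PER_PC = 206264.8      # Eq. (14.4.41)
AU_CM = 1.49598e13        # Eq. (14.4.39), converted to cm


def apparent_luminosity(m: float) -> float:
    """Eq. (14.4.43), in erg/cm^2-sec."""
    return 10.0 ** (-2.0 * m / 5.0) * 2.52e-5


def absolute_luminosity(M: float) -> float:
    """Eq. (14.4.44), in erg/sec."""
    return 10.0 ** (-2.0 * M / 5.0) * 3.02e35


def luminosity_distance_pc(m: float, M: float) -> float:
    """Eq. (14.4.45), in parsecs."""
    return 10.0 ** (1.0 + (m - M) / 5.0)


if __name__ == "__main__":
    m_sun, M_sun = -26.85, 4.72
    d_pc = luminosity_distance_pc(m_sun, M_sun)
    print("distance of the sun in a.u.:", d_pc * AU_PER_PC)
    # Cross-check with Eq. (14.4.13), d_L = (L / 4 pi l)^(1/2), in cm
    d_cm = math.sqrt(absolute_luminosity(M_sun) / (4.0 * math.pi * apparent_luminosity(m_sun)))
    print("same distance from L and l, in a.u.:", d_cm / AU_CM)
```

Both routes return a distance close to 1 a.u., which confirms that the two normalization constants and the distance-modulus formula are mutually consistent.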


5 The Cosmic Distance Ladder 

If we know the absolute luminosity L of a light source, then we can determine 
its luminosity distance d L by measuring its apparent luminosity l and using 
(14.4.13). The difficult problem is to determine L . At present, there is a ladder of 
distance determinations, with five distinct rungs, that must be climbed to get out 
to cosmologically interesting distances. (See Figure 14.3.) 


Kinematic Methods 

It is possible to measure the distance of some of the nearest stars by methods 
that do not require prior knowledge of the absolute luminosity L. One such star is 
the sun. Its distance, the astronomical unit, was first measured with tolerable 
accuracy in 1672 by Jean Richer and Giovanni Domenico Cassini. They determined 
the distance to Mars, and hence to the sun, by measuring the difference between 
the directions to Mars as seen from Paris and Cayenne, a known baseline of 6000 
miles. Of course, our knowledge of the astronomical unit has in the ensuing three 
centuries been enormously improved, most recently by the use of radar astronomy. 

A few thousand other stars are close enough so that their distances can be 
determined from the shift in their apparent positions caused by the earth’s 
revolution about the sun. We have defined the trigonometric parallax π of a star
as the maximum angular radius of the ellipse traced out annually by the star's
apparent motion in the sky; the star’s distance in parsecs is 1/π, with π expressed
in seconds of arc. (The adjective “trigonometric” is used here because astronomers
have the habit of expressing stellar distances, however measured, in terms of a
parallax, so that one encounters photometric parallaxes, moving-cluster parallaxes,
and so on.)
Figure 14.3 The cosmic distance ladder. The position and height of the vertical bars
mark roughly the range of distance over which each class of distance indicators may 
be used. 


The first star whose distance was measured in this way was 61 Cygni; in 1838 
Friedrich Wilhelm Bessel determined its trigonometric parallax as about 0.3", 
and hence its distance as about 3 pc. (Thomas Henderson had measured the 
trigonometric parallax of α Centauri in 1832, but his calculations were not
published until 1838.) Generally it is possible to determine stellar distances from
trigonometric parallaxes only when π is greater than about 0.03", that is, only for
stars closer than about 30 pc. 

In recent years it has become possible to measure the distance to some nearby 
clusters of stars by a method based ultimately on our knowledge of the speed of 
light rather than the astronomical unit. These moving clusters consist of stars 
moving through the galaxy with equal and parallel velocities, as shown by the fact
that their proper motions across the sky seem to converge to a common point.
The radial velocities v_r of the stars can be determined from the Doppler shifts
Δν/ν of their spectra (and the known speed of light), while the velocity components
transverse to the line of sight can be expressed as the product of the distance to the 
cluster times the proper motion (in radians per unit time) of the star across the 
sky. [See Eq. (14.4.19).] Thus observations of Doppler shifts and proper motions 
give us a complete kinematic model of the cluster, the single unknown being its 
distance. The distance can then be determined by imposing on this model the 
condition that all stars move with equal and parallel velocity. The best studied 
moving cluster is the Hyades, which contains about 100 stars within a radius of 
about 5 pc. Its distance has been measured by this “moving cluster method” as 
about 40.8 pc. 

It is sometimes possible to estimate the distances of stars, which are neither in 
moving clusters nor close enough for measurement of trigonometric parallaxes, by a 
statistical analysis of proper motions and radial velocities. Suppose that we know 
the relative distances of a sample of stars, that is, that we know the ratios d/d_0,
where d_0 is some unknown distance scale. (This would be the case, for example, if
we knew that all stars in the sample had the same unknown absolute luminosity L,
for then the apparent luminosities l would give us the relative distances through
the formula d = (L/4πl)^{1/2}. Even if different stars in the sample have different
absolute luminosities, measurement of their apparent luminosities will still give
their relative distances, if we know the ratios of their absolute luminosities.) The
transverse velocity is related to the radial velocity by

    v_\perp = v_r \tan\phi

where φ is the unknown angle between the star’s velocity and the line of sight.
Equation (14.4.19) can thus be written

    \frac{\mu}{v_r} \frac{d}{d_0} = \frac{\tan\phi}{d_0}

By measuring the quantities on the left-hand side for a large sample of stars, and
making some reasonable guess as to the distribution in φ, it is then possible to
deduce the unknown constant d_0. Although this method can be used at distances
beyond 200 pc, it is intrinsically inaccurate, and can be thrown off badly if the
sample of stars studied does not have the assumed distribution in φ.
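A toy version of this statistical method can be sketched in Python. The estimator below is only one possible choice and is not taken from the text: it assumes that the velocity directions are isotropic, in which case the mean of |v_⊥| is π/2 times the mean of |v_r|, and it uses the relation μ (d/d_0) = v_⊥/d_0. The simulated sample, the function names, and the consistent but arbitrary units are all hypothetical.

```python
import math
import random


def estimate_distance_scale(mu, relative_distance, v_r):
    """Estimate d_0 under the ASSUMPTION of isotropic velocity directions.

    mu[i] * relative_distance[i] equals v_perp / d_0 for each star, and for
    isotropic directions <|v_perp|> = (pi / 2) <|v_r|>.
    """
    mean_mu_s = sum(abs(m) * s for m, s in zip(mu, relative_distance)) / len(mu)
    mean_v_r = sum(abs(v) for v in v_r) / len(v_r)
    return (math.pi / 2.0) * mean_v_r / mean_mu_s


def simulate_sample(d0_true, n, seed=0):
    """Hypothetical star sample with isotropic velocities (consistent units)."""
    rng = random.Random(seed)
    mu, rel, v_r = [], [], []
    for _ in range(n):
        s = rng.uniform(0.5, 2.0)            # known relative distance d / d_0
        speed = rng.uniform(10.0, 30.0)
        cos_phi = rng.uniform(-1.0, 1.0)     # isotropic directions
        sin_phi = math.sqrt(1.0 - cos_phi * cos_phi)
        v_r.append(speed * cos_phi)
        mu.append(speed * sin_phi / (s * d0_true))   # mu = v_perp / d
        rel.append(s)
    return mu, rel, v_r


if __name__ == "__main__":
    mu, rel, v_r = simulate_sample(d0_true=100.0, n=20_000)
    print(estimate_distance_scale(mu, rel, v_r))
```

If the simulated directions were drawn from a different distribution than the one assumed by the estimator, the recovered d_0 would be biased, which illustrates the sensitivity noted above.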

It hardly needs to be mentioned that all of the above kinematic distance 
measurements can be used only for stars within our galaxy, where cosmological 
effects are surely negligible. Thus they can be regarded as determinations of the 
luminosity distance d_L, or the proper distance d_prop, or what you will. (It has
occasionally been proposed that trigonometric parallaxes might be measured out
to distances of order 10^8 pc by interferometric radio observations, using as baseline
the distance from the earth to an artificial satellite in orbit around the sun. If this
could be accomplished, then the problems of cosmography could be solved by
determination of trigonometric parallax as a function of red shift.)


Main-Sequence Photometry (< 10^5 pc)


Once we know the distance of a star by one of the above kinematic methods, 
we can determine its absolute luminosity L by measuring its apparent luminosity l 
and using the formula L = 4πd^2 l. In this way it was independently discovered by
Ejnar Hertzsprung and Henry Norris Russell during the decade 1905-1915 that a 
large proportion of nearby stars, the main sequence, obey a rather strict relation 
between absolute luminosity and spectral type. (The spectral type, which is 
actually a measure of surface temperature, is usually denoted by one of the letters 
O, B, A, F, G, K, M, R, N, S, with O very hot and S comparatively cold. (See
Figure 14.4.) The canonical mnemonic is “Oh be a fine girl, kiss me right now sweet- 
heart!”) Astrophysical theory 17 explains the main sequence as a rather long 
initial phase in the thermonuclear evolution of almost all stars. 



Figure 14.4 Spectra of stars belonging to various spectral classes. (Courtesy Mt. 
Wilson and Mt. Palomar observatories.) 


Given the Hertzsprung-Russell relation between absolute luminosity and 
spectral type, it became possible for astronomers to determine the distance to any 
main-sequence star whose spectral type and apparent luminosity could be 
measured. The method works best when applied to a cluster of stars, all of which are 
at about the same distance from the earth, for then the main sequence can be
picked out by plotting apparent luminosity versus spectral type for a large sample
of cluster stars. It also works well only for the lower part of the main sequence,
where the Hertzsprung-Russell relation is best known.

The catalogued clusters of our galaxy are divided between some 650 open
clusters, like the Hyades and Pleiades, each containing 20 to 1000 stars, and some
130 globular clusters, such as the great cluster M13 in Hercules, each containing
10^5 to 10^7 stars. (See Figures 14.5 and 14.6.) In determining the distance of these



Figure 14.5 The open cluster NGC2682, in Cancer; photographed with the 200-in. 
telescope at Mt. Palomar. (Courtesy Mt. Wilson and Mt. Palomar observatories.) 


clusters it is important to recognize a distinction between their stellar populations, 
first pointed out by Walter Baade 18 in 1944. (See Figure 14.7.) Stars in open 
clusters, as well as most nearby stars like the sun, generally belong to Population I,
which is characterized by high metal content and relative youth, and is limited in
our galaxy to the spiral arms. Stars in globular clusters belong to Population II,
which is characterized by lower metal content and greater age, and pervades the 
whole galaxy. There are differences between the main sequences of Populations I 
and II, so the use of a main sequence, calibrated from nearby stars, to determine 


Figure 14.6 The globular cluster NGC6205 (M13) in Hercules; photographed with the 
200-in. telescope at Mt. Palomar. (Courtesy Mt. Wilson and Mt. Palomar observatories.) 



Figure 14.7 Examples of stellar populations I and II; photographed with the 200-in. 
telescope at Mt. Palomar. On the left are Pop. I stars in the spiral arms of M31; on 
the right Pop. II stars in the M31 satellite NGC 205. (Courtesy Mt. Wilson and Mt. 
Palomar observatories.) 
the distance of the globular clusters is beset with technical complications, which
need not concern us here. 

The method of determining distances by main-sequence photometry is limited 
because typical main-sequence stars are not particularly bright. For instance, the 
Hale reflector at Palomar has difficulty in resolving stars that are fainter than 
m = 22.7, so it can resolve a star with the absolute magnitude M = 4.7 of the sun
only out to a distance modulus m - M = 18, which according to Eq. (14.4.45)
corresponds to a distance of 40,000 pc. 
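The arithmetic behind this limit follows directly from Eq. (14.4.45), as the Python sketch below shows. The function names are illustrative; the limiting magnitude 22.7 and the solar absolute magnitude 4.7 are the values quoted in the text.

```python
import math


def limiting_distance_pc(m_limit: float, M: float) -> float:
    """Largest distance (pc) at which a star of absolute magnitude M is still
    brighter than apparent magnitude m_limit, from Eq. (14.4.45)."""
    return 10.0 ** (1.0 + (m_limit - M) / 5.0)


def faintest_absolute_magnitude(m_limit: float, distance_pc: float) -> float:
    """Inverse relation: the faintest M visible at a given distance."""
    return m_limit - 5.0 * math.log10(distance_pc / 10.0)


if __name__ == "__main__":
    print(limiting_distance_pc(22.7, 4.7))             # sun-like star
    print(faintest_absolute_magnitude(22.7, 1.0e5))    # what can be seen at 10^5 pc
```

A distance modulus of 18 corresponds to 10^{4.6} pc, which rounds to the 40,000 pc quoted above; at 10^5 pc only stars with M brighter than about 2.7 remain above the same limit.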

At present, it is primarily the stars of the Hyades that are used to calibrate the 
Hertzsprung-Russell relation, so the whole scale of galactic and extragalactic 
distances rests on our knowledge of the distance to the Hyades, as determined by 
the “moving-cluster method” discussed above. Recently Hodge and Wallerstein 19 
have noted that both the mean trigonometric parallax of the Hyades stars, and the 
comparison of their apparent magnitudes with the Hertzsprung-Russell relation 
obtained from other nearby stars, suggest that the distance to the Hyades may be 
about 50 pc rather than 40.8 pc. If this is so, then all galactic and extragalactic 
distances must be increased by about 20%. 

Variable Stars (< 4 x 10^6 pc) 

There are about 10,000 catalogued stars whose apparent luminosity is observed 
to vary more or less regularly with time. In setting the extragalactic distance 
scale, an important role is presently played by two families of variable stars, the 
cluster variables, or RR Lyrae stars, and the classical Cepheids, or δ Cephei stars. 
The RR Lyrae stars have periods ranging from a few hours to a day, and belong to 
Population II, while the classical Cepheids have periods ranging from 2 to 40 days, 
and belong to Population I. (In addition there is another family of variable stars, 
the W Virginis stars, which belong to Population II, but have long periods like the 
classical Cepheids. As we shall see, W Virginis stars were confused with Cepheids 
before Baade distinguished the two stellar populations.) 

The absolute magnitudes of the RR Lyrae stars are presently best known both 
through direct statistical studies of proper motion and parallax and through their 
presence in globular clusters, whose distance can be determined by main-sequence 
photometry. In this way it has been found 20 that the RR Lyrae stars all have 
roughly the same absolute magnitude, somewhere between M_v ~ 0.2 and 
M_v ~ 1.0. Thus, once we recognize an RR Lyrae star by its short-period pulsation, 
we can estimate its distance from its apparent magnitude. However, RR Lyrae 
stars are not bright enough to be used at distances beyond about 3 x 10^5 pc. 
For this reason, much more attention has been devoted to the brighter classical 
Cepheids. 

Unfortunately, the classical Cepheids differ widely in absolute luminosity. 
However, in 1912 it was noted by Henrietta Swan Leavitt 21 that the 25 classical 
Cepheids then known in the smaller Magellanic cloud have apparent luminosities 
given by a smooth function l_SMC(P) of the period P. (Roughly, l ∝ P.) The stars 
of this cloud are all at about the same distance from the earth, so Leavitt could 



Figure 14.8 The great galaxy M31 (NGC 224) in Andromeda, with satellite galaxies 
NGC 205 and 221. This photograph was taken with the 48-in. Schmidt telescope at 
Mt. Palomar. (Courtesy Mt. Wilson and Mt. Palomar observatories.) 
conclude that the absolute luminosity of a classical Cepheid of period P is a smooth 
function L(P), proportional to l_SMC(P). However, she did not know the distance 
to the smaller Magellanic cloud, and there are no Cepheids near enough to the earth 
to have measurable trigonometric parallax, so Leavitt could not determine the 
constant of proportionality. 
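
Leavitt's argument can be put in one line. This is only a sketch in the notation of this chapter: the symbol d_SMC, standing for the then unknown distance of the cloud, is introduced here for illustration, and the red shift of the cloud is neglected.

```latex
L(P) = 4\pi\, d_{\rm SMC}^{2}\; l_{\rm SMC}(P),
\qquad
\frac{L(P_1)}{L(P_2)} = \frac{l_{\rm SMC}(P_1)}{l_{\rm SMC}(P_2)} .
```

Ratios of absolute luminosities therefore follow from observation alone, while the overall scale requires one independently known distance.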

The laborious task of calibrating the Cepheid P-L relation was carried out 
between the two world wars, first by Russell 22 and Hertzsprung, 23 then by Harlow 
Shapley, 24 and eventually by Ralph E. Wilson. 25 They did not then make use of 
main-sequence photometry; rather, the main tool was the statistical analysis of 
proper motions and radial velocities for the Cepheids nearest the sun, as described 
above under “Kinematic Methods,” with ratios of Cepheid absolute luminosities 
provided by the P-l relation of the Magellanic clouds. Cepheids were discovered in 
the great nebula M31 in Andromeda by Edwin Hubble 26 in 1923, and their 
observed periods and apparent luminosities were used, together with the Cepheid 
P-L relation, to estimate the distance of M31 as 280,000 pc. (See Figures 14.8 and 
14.9.) It was this measurement that definitely established the status of the “spiral 
nebulae” as islands of stars comparable with our own galaxy, as suggested by 
Immanuel Kant, rather than mere clouds or clusters within our galaxy. Taking 




Figure 14.9 Variable stars in a portion of M31. Two of the variables are marked. This 
photograph was taken with the 200-in. telescope at Mt. Palomar. (Courtesy Mt. Wilson 
and Mt. Palomar observatories.) 
interstellar absorption into account later lowered this figure to 230,000 pc, but 
otherwise the extragalactic distance scale remained essentially unchanged until 
Palomar started operations in 1950. 

By 1952 it had become apparent that something was seriously wrong with the 
Cepheid P-L relation of Shapley et al. Photographs of M31, taken with a 30-minute 
exposure at Palomar, showed only the brightest stars of Population II, but no 
RR Lyrae variables. This indicated that the brightest Population II stars of M31 
had apparent photographic magnitude m ~ 22.4, and since the RR Lyrae stars 
were known to have an absolute luminosity about four times fainter, their apparent 
magnitudes in M31 would have to be about m ~ 23.9, beyond the reach of Palomar. 
However, the absolute magnitude of the RR Lyrae stars was by then 
reasonably well known through the photometric determination by Allan Sandage 
of the distance of the globular cluster M3. If M31 were really at a distance of 
230,000 pc, then the RR Lyrae stars ought to have shown up at m ~ 22.4 at least 
in their maximum phase, and the brightest stars of Population II ought to have had 
apparent magnitude m ~ 20.9, rather than m ~ 22.4. This discrepancy was 
interpreted by Baade 27 to mean that M31 was not at 230,000 pc, but at a distance 
about twice as great (a difference of 1.5 apparent magnitudes corresponds to a 
factor of 2 in distance), so that the classical Cepheids in its spiral arms had to be 
about four times brighter than had been estimated. 
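
The two factors quoted in this argument follow directly from the definition of magnitudes. The short sketch below, with illustrative function names, converts a magnitude difference into the corresponding distance factor (at fixed absolute luminosity) and luminosity ratio (at fixed distance).

```python
def distance_factor(delta_m):
    """Factor by which the inferred distance grows when a source of fixed
    absolute luminosity appears delta_m magnitudes fainter."""
    return 10.0 ** (delta_m / 5.0)


def luminosity_ratio(delta_m):
    """Luminosity ratio corresponding to a difference of delta_m magnitudes."""
    return 10.0 ** (0.4 * delta_m)


delta_m = 23.9 - 22.4  # the 1.5 magnitude discrepancy found at Palomar
print(f"distance factor   ~ {distance_factor(delta_m):.2f}")   # close to 2
print(f"luminosity ratio  ~ {luminosity_ratio(delta_m):.2f}")  # close to 4
```

The luminosity ratio is the square of the distance factor, as the inverse-square law requires.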

The source of this error is somewhat obscure. The P-L relation, as calibrated 
by Shapley et al., actually works rather well for the Population II W Virginis 
variables, but fails for the Population I classical Cepheids, which are generally 
four times brighter than W Virginis stars of the same period. However, it should not 
be thought that Shapley, not knowing of the distinction between stellar populations, 
had based his calibration on W Virginis stars rather than classical Cepheids. 
Indeed, the 11 variables considered by Shapley 24 in 1918 were Population I stars, 
and even included the eponymous classical Cepheid, δ Cephei! (In any case, 
W Virginis stars are both less luminous than classical Cepheids, and rarer near the 
sun, so it would have been remarkable if they had played a large part in the 
statistical proper-motion studies used to calibrate the Cepheids.) A recent reanalysis 28 
of the same 11 classical Cepheids used in Shapley’s calibration reveals 
that Shapley’s calibration contained errors of about 0.7 magnitudes due to the 
neglect of interstellar absorption, 0.6 magnitudes due to systematic errors in 
proper motions, and 0.1 or 0.2 magnitudes due to galactic rotation, which introduces 
anisotropies in the distribution of stellar velocities. All these errors tended in 
the same direction, and led to the famous 1.5 magnitude underestimate in Cepheid 
absolute luminosities discovered by Baade in 1952. Thus it was a pure coincidence 
that Shapley’s original P-L curve, though not valid for the classical Cepheids of 
Population I, was actually valid for the W Virginis stars 29 of Population II. 

One may also ask why Shapley’s calibration was repeatedly confirmed during 
the third of a century from 1918 to 1952. One simple reason is that interstellar 
absorption was persistently underestimated. Thus, when Ralph Wilson 25 attempted 
to improve the statistical analysis of proper motions and radial velocities by using 
more Cepheids, 74 in 1923 and 157 in 1939, he had to include more and more 
distant stars, so that his improvement in statistical accuracy was cancelled by the 
increasing effects of absorption. Another reason does have to do with the confusion 
of populations. Shapley 24 had immediately applied his P-L relation to what he 
thought were classical Cepheids in the globular clusters ω Cen, M3, and M5, and in 
this way he was able to determine the distance to these globular clusters, and 
thereby to calculate the absolute magnitudes of their short-period variables, the 
RR Lyrae stars. This procedure actually gave the right answer, because the stars 
in the globular clusters that Shapley thought were classical Cepheids really were 
W Virginis stars, and Shapley’s P-L calibration, though seriously in error for the 
classical Cepheids from which it had been derived, was accidentally more or less 
correct for the W Virginis stars! Hence, when statistical studies of the proper 
motions and radial velocities of nearby RR Lyrae stars were carried out a few years 
later, they tended to confirm Shapley’s estimate of the absolute magnitudes of 
the RR Lyrae stars, and this naturally appeared as a confirmation of the Cepheid 
P-L relation. The argument was then turned around: Taking the ratio of RR Lyrae 
to “Cepheid” luminosities from the globular clusters, Wilson included 10 RR 
Lyrae stars along with the 74 Cepheids in his 1923 analysis of proper motions and 
radial velocities, and in 1939 he included 67 RR Lyrae stars along with 157 
Cepheids. Oddly enough, the RR Lyrae stars did not, like the Cepheids, introduce 
large errors through the neglect of absorption, because RR Lyrae stars, being of 
Population II, are mostly found outside the plane of the galaxy. Rather, the 
trouble was that the classical Cepheids do not, like the W Virginis stars, fall on a 
P-L curve that extrapolates smoothly down to the RR Lyrae stars, but instead 
lie 1.4 magnitudes higher. 

It should be noted that Baade’s 1952 revision of the Cepheid P-L relation 
doubled the extragalactic distance scale, but did not affect the estimated size of 
our own galaxy, because the galactic distance scale was determined from the 
distances of the globular clusters, which, as we have seen, had partly by accident 
been determined correctly. Before 1952, it appeared that all neighboring galaxies 
were distinctly smaller than our own. After 1952, it was clear that many other 
galaxies are as large as or larger than our own, a highly satisfactory if sobering 
result. 

The calibration of the classical Cepheids has since been put on a much firmer 
footing by the discovery of five classical Cepheids in the galactic open clusters 
NGC6087, NGC129, M25, NGC7790, and NGC6664, together with four more 
classical Cepheids in the “association” h + χ Persei. The distance to these clusters 
is known through the photometric study of their main-sequence stars by Kraft, 30 
and these nine Cepheids of known absolute magnitude have been used by Kraft, 
and more recently by Sandage and Tammann, 31 to fix the absolute scale of the 
Cepheid P-L relation. (The form of this relation, which, to be precise, actually 
relates period, luminosity, and color, is of course determined by a much larger 
sample of Cepheids, taken from both Magellanic clouds, the Andromeda galaxy 
M31, and the small Fornax galaxy NGC6822.) At present the distance of M31, as 
determined by its classical Cepheids, is given as 700,000 pc, about three times 
greater than the distance accepted during the 1930’s. 

The RR Lyrae stars and classical Cepheids can be used to determine the 
distances for all members of the association of nearby galaxies and stellar systems 
known as the local group. Of these, it is only the closest objects, such as the Magellanic 
clouds and the Ursa Minor, Draco, and Sculptor systems, in which RR 
Lyrae stars can be employed. For all the major galaxies of the local group, 
such as M31 and M33, it is necessary to use the classical Cepheid P-L relation, as 
calibrated by the nine known Cepheids in the open clusters and associations of our 
galaxy. The classical Cepheids are bright enough at maximum light (M_max ~ -5.3) 
to be used at distances of order 4 x 10^6 pc, which is far enough to reach 
some galaxies outside our own local group, such as the beautiful spiral M81. 
However, the Cepheids are not bright enough to be used to determine the distance 
of the nearest cluster of galaxies, the Virgo cluster. 

Novae, H II Regions, Brightest Stars, Globular Clusters, and so on (< 3 x 10^7 pc) 

The next rung may at present be the weakest. 32 In order to estimate the 
distances of objects far outside our local group of galaxies, it is necessary to find 
some distance indicators that are brighter than the Cepheids, but are present in 
sufficient numbers in our local group of galaxies (whose distances are known via 
the Cepheids) to permit an accurate calibration of their properties. 

Novae are sudden increases by four to six orders of magnitude in the luminosity 
of a star, and occur in typical galaxies at a rate of 40 per year. They have been used 
as distance indicators since 1917, when a nova was found in the spiral nebula 
NGC6946. The brightest novae reach M_v ~ -7.5, so they can in principle be used 
as distance indicators out to about 10^7 pc, but they tend to occur in the bright 
central regions of galaxies, and are therefore difficult to resolve. 

Until recently, the primary distance indicators used to reach beyond our local 
group were the brightest stars of galaxies. A survey of the local group reveals that 
the stars of each galaxy generally have a well-defined maximum absolute luminosity, 
about M_v ~ -9.3. They can therefore be used as distance indicators out to about 
3 x 10^7 pc, but at distances beyond 10^7 pc, it is difficult to distinguish between 
brightest stars and nonstellar objects, such as associations or emission regions. 
(Indeed, it is believed that Hubble’s 1936 calibration 33 of the distance scale was in 
error, partly because he confused such objects with brightest stars.) 
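
The reach quoted for each stellar indicator can be checked against the absolute magnitudes given in the text. The sketch below assumes, as a simplification not stated in the source, that the limiting magnitude m = 22.7 quoted earlier for the Hale reflector applies equally to every indicator; the RR Lyrae value is taken as the midpoint of the quoted range 0.2 to 1.0.

```python
LIMITING_MAGNITUDE = 22.7  # faintest resolvable star quoted for the Hale reflector

# Absolute magnitudes quoted in the text (RR Lyrae: midpoint of 0.2 to 1.0).
INDICATORS = {
    "RR Lyrae stars": 0.6,
    "classical Cepheids (maximum light)": -5.3,
    "brightest novae": -7.5,
    "brightest stars": -9.3,
}


def reach_pc(absolute_magnitude, limiting_magnitude=LIMITING_MAGNITUDE):
    """Distance in pc at which the indicator fades to the limiting magnitude."""
    distance_modulus = limiting_magnitude - absolute_magnitude
    return 10.0 ** (1.0 + distance_modulus / 5.0)


for name, M in INDICATORS.items():
    print(f"{name:36s} M = {M:5.1f}   reach ~ {reach_pc(M):.1e} pc")
```

The Cepheids come out near 4 x 10^6 pc, the novae near 10^7 pc, and the brightest stars near 3 x 10^7 pc, while the RR Lyrae stars reach only a few times 10^5 pc. In practice crowding and confusion with nonstellar objects cut these limits further, as the text notes.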

It is also possible to use certain nonstellar objects as distance indicators. 
Among these are the H II regions, large clouds of interstellar hydrogen that are 
ionized and made luminous by the presence of O and B stars. They are hundreds of 
parsecs in diameter, so their angular diameters might be used to estimate their 
distances out to about 10^8 pc. 

Recently Sandage 34 has developed the use of globular clusters as a distance 
indicator that may prove more reliable than any of the above. The hundreds of 
globular clusters in our galaxy have absolute magnitudes M_v that are typically 
about -8, but vary widely about this mean. However, the study 35 of 2000 globular 
clusters in the large E (elliptical) galaxy M87 of the Virgo cluster (see Figure 14.10) 

Figure 14.10 The giant elliptical galaxy M87 (NGC4486) in the Virgo cluster, for four 
different directions of polarization. These photographs, taken with the 200-in. telescope 
at Mt. Palomar, are underexposed in order to reveal the galactic nucleus and 
the remarkable “jet.” (Courtesy Mt. Wilson and Mt. Palomar observatories.) 


reveals a sharp cutoff in their luminosity distribution, with m_B(max) ~ 21.3. 
Sandage suggests that the absolute magnitude of the brightest globular clusters in 
M87 be identified with the absolute magnitude of the brightest globular cluster 
B282 of the Andromeda galaxy M31, which is known to have absolute magnitude 
M_B(B282) ~ -9.83. The distance modulus of M87 is thus 21.3 minus -9.8, or 
31.1, giving a distance for M87, and hence for the Virgo cluster, of 1.7 x 10^7 pc. 
Of course it is not definitely known that the luminosity distribution of the globular 
clusters has a sharp cutoff rather than a smooth tail at high luminosities. De 
Vaucouleurs 36 has examined the latter possibility, and concludes that the Virgo 
cluster is at a distance of 2 x 10^7 pc, 20% farther than calculated by Sandage. 


Brightest Galaxies (< 10^10 pc) 

The Virgo cluster has a small mean red shift, z = 0.0038, corresponding to a 
radial velocity of about 1100 km/sec. This is not much larger than the mean 
random velocity of typical galaxies, and it is only when we get out beyond the 
Virgo cluster that the cosmological expansion dominates the velocity field. In 
order to get out to these cosmologically interesting distances, it is generally 
necessary to use whole galaxies as distance indicators. 

Clusters of galaxies contain hundreds to thousands of distinct galaxies (the 
Virgo cluster contains 2500), so if there is any natural upper limit to the absolute 
luminosity of individual galaxies, the absolute luminosity of the brightest galaxy in 
a rich cluster ought to be near that maximum. For this reason Edwin Hubble 33 
in 1936 suggested the use of the brightest galaxies in clusters as distance indicators. 
(He actually used the fifth brightest, to minimize observational errors.) This 
procedure was validated when it was found that the use of brightest galaxies as 
distance indicators gave a good linear relation between luminosity distance and z 
for 10 clusters with z << 1. (See the next section.) These brightest cluster galaxies 
are usually the elliptical galaxies known as type E in Hubble’s classification scheme. 
(See Appendix.) 

According to Sandage, 34 the brightest E galaxy in the Virgo cluster is 
NGC4472, with absolute magnitude M_B ~ -21.68, determined by using the 
globular clusters in M87 to give the distance to the Virgo cluster. If all brightest E 
galaxies have absolute magnitude M_B ~ -21.7, then they can be used as distance 
indicators out to a distance modulus m - M of about 44.5, or a luminosity 
distance of about 10^10 pc. 

However, it is possible that there is no sharp cutoff to the luminosity distribution 
function of galaxies in clusters. In this case, the use of brightest galaxies as 
distance indicators would be complicated by the Scott effect, 37 first discussed in 
this connection by Elizabeth L. Scott. As we look out to greater and greater 
distances, we tend to select increasingly rich clusters of galaxies for study, and if 
there is no absolute upper limit to galactic luminosity, the brightest galaxies in 
these clusters will have greater and greater absolute luminosity. If we mistakenly 
assume that these distant galaxies have the same absolute luminosity as NGC4472, 
we will underestimate their true luminosity distance. The existence or nonexistence 
of a Scott effect is still a matter of controversy. 38 Other problems affecting the use 
of brightest galaxies as distance indicators are discussed in the next section. 

One need only summarize the rungs of the cosmic distance ladder to see how 
shaky it is. At the time of writing, the distance to the Hyades is determined by 
observation of its stars’ proper motions and radial velocities; the distance to five 
open galactic clusters and the h + χ Persei association is determined by photometry 
of their main-sequence stars, whose absolute magnitude is known by study 
of the Hyades; the distance to the Andromeda nebula M31 is determined from 
observation of classical Cepheids, whose P-L relation is calibrated using the nine 
known Cepheids in open clusters and the h + χ Persei association; the distance to 
the Virgo cluster is determined by assuming that the brightest globular cluster in 
M87 has the same absolute luminosity as the brightest globular cluster B282 of 
M31; and the distances to more distant clusters of galaxies are determined by 
assuming that their brightest E galaxies have the same absolute luminosity as the 
brightest galaxy NGC4472 in the Virgo cluster. It is entirely possible that new 
errors may be found at any rung of the ladder, in which case adjustments would 
have to be made at all higher rungs. 
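
The way a correction propagates up the ladder can be illustrated with the distances quoted in this section. The sketch below assumes, as a simplification, that each rung scales linearly with the rung beneath it, so that a revised distance at one rung multiplies every higher rung by the same factor; the revision used is the possible change of the Hyades distance from 40.8 pc to 50 pc mentioned earlier.

```python
# Distances in parsecs as quoted in this section.
LADDER = [
    ("Hyades", 40.8),
    ("Andromeda galaxy M31", 7.0e5),
    ("Virgo cluster", 1.7e7),
]


def rescale_from(ladder, rung_index, new_distance):
    """Return a new ladder in which rung `rung_index` is set to `new_distance`
    and every higher rung is multiplied by the same factor.
    Lower rungs are left unchanged."""
    factor = new_distance / ladder[rung_index][1]
    return [
        (name, d * factor if i >= rung_index else d)
        for i, (name, d) in enumerate(ladder)
    ]


revised = rescale_from(LADDER, 0, 50.0)
for (name, old), (_, new) in zip(LADDER, revised):
    print(f"{name:22s} {old:10.3g} pc -> {new:10.3g} pc")
```

A change at the lowest rung moves everything, whereas a change at the Virgo rung would leave the Hyades and M31 untouched. This asymmetry is what makes errors in the lower rungs so costly.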
6 The Red-Shift Versus Distance Relation 


We shall now consider how the correlation of red shifts with distances can be 
used to gain information about the cosmic scale factor R(t). For our present purposes, 
only luminosity distances will be considered; Eqs. (14.4.22) and (14.4.23) 
show that no new information can be gained by studying the correlation of red 
shifts with angular diameters or proper motions instead of apparent luminosities. 

Suppose then that astronomers are able to define some family of objects, like 
the brightest E galaxies discussed at the end of the last section, whose absolute 
luminosities L are all known. By measuring their apparent luminosities, their 
luminosity distances can be calculated from Eq. (14.4.13): 


$$ d_L = \left( \frac{L}{4\pi l} \right)^{1/2} $$

Suppose also that the red shifts z of these objects are measured, so that an 
empirical curve for d_L(z) is known. What does this tell us about R(t)? 

The observables d_L and z are related to the unknown coordinates of the light 
source by the theoretical relations (14.3.1), (14.3.6), and (14.4.14): 

$$ \int_{t_1}^{t_0} \frac{dt}{R(t)} = \int_0^{r_1} \left[1 - k r^2\right]^{-1/2} dr $$

$$ z = \frac{R(t_0)}{R(t_1)} - 1 $$

$$ d_L = r_1 \frac{R^2(t_0)}{R(t_1)} = r_1 R(t_0)(1 + z) $$

At present, the curve d_L(z) is tolerably well known only for small z, so we are 
primarily concerned with the case where t_0 - t_1 and r_1 are small. The cosmic 
scale factor R(t) may then be most usefully expressed as a power series 

$$ R(t) = R(t_0)\left[1 + H_0 (t - t_0) - \tfrac{1}{2} q_0 H_0^2 (t - t_0)^2 + \cdots\right] \tag{14.6.1} $$

where t_0 is the present moment, and H_0 and q_0 are parameters known as Hubble's 
constant and the deacceleration parameter: 

$$ H_0 \equiv \frac{\dot{R}(t_0)}{R(t_0)} \tag{14.6.2} $$

$$ q_0 \equiv -\ddot{R}(t_0) \frac{R(t_0)}{\dot{R}^2(t_0)} \tag{14.6.3} $$

(Dots denote derivatives with respect to time.) We shall see in the next chapter 
that the whole function R(t) can be calculated by using Einstein’s field equations if 
we know the values of H_0 and q_0, with k > 0 if q_0 > 1/2 and k < 0 if q_0 < 1/2. Our 
present discussion will therefore be directed to the measurement of these two 
critical parameters. 

The use of Eq. (14.6.1) in (14.3.6) yields a power series for the red shift as a 
function of the time of flight t_0 - t_1: 

$$ z = H_0 (t_0 - t_1) + \left(1 + \frac{q_0}{2}\right) H_0^2 (t_0 - t_1)^2 + \cdots \tag{14.6.4} $$

Inverting this power series, we obtain a formula for the time of flight in terms of 
the red shift: 

$$ t_0 - t_1 = \frac{1}{H_0}\left[z - \left(1 + \frac{q_0}{2}\right) z^2 + \cdots\right] \tag{14.6.5} $$

To find r_1, we expand (14.3.1): 

$$ \frac{1}{R(t_0)} \int_{t_1}^{t_0} \left[1 + H_0 (t_0 - t) + \left(1 + \frac{q_0}{2}\right) H_0^2 (t_0 - t)^2 + \cdots\right] dt = r_1 + O(r_1^3) $$

so 

$$ r_1 = \frac{1}{R(t_0)}\left[t_0 - t_1 + \tfrac{1}{2} H_0 (t_0 - t_1)^2 + \cdots\right] \tag{14.6.6} $$

Using (14.6.5) in (14.6.6) gives r_1 in terms of the red shift: 

$$ r_1 = \frac{1}{R(t_0) H_0}\left[z - \tfrac{1}{2}(1 + q_0) z^2 + \cdots\right] \tag{14.6.7} $$

and (14.4.14) then gives the luminosity distance as a power series: 

$$ d_L = H_0^{-1}\left[z + \tfrac{1}{2}(1 - q_0) z^2 + \cdots\right] \tag{14.6.8} $$

This can also be written as a formula for apparent luminosity: 

$$ l = \frac{L}{4\pi d_L^2} = \frac{L H_0^2}{4\pi z^2}\left[1 + (q_0 - 1) z + \cdots\right] \tag{14.6.9} $$

or equivalently, for the distance modulus: 

$$ m - M = 25 - 5 \log_{10} H_0\,(\mathrm{km/sec/Mpc}) + 5 \log_{10} cz\,(\mathrm{km/sec}) + 1.086 (1 - q_0) z + \cdots \tag{14.6.10} $$

[One Mpc is 10^6 pc. Note that 100 km/sec/Mpc equals (9.78 x 10^9 years)^-1.] The 
program is then to compare either (14.6.8), (14.6.9), or (14.6.10) with astronomical 
data, and thereby to determine the critical parameters q_0 and H_0. 
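
As a rough numerical check on these expansions, the two-term series for d_L can be compared with a case in which the luminosity distance is known in closed form. The sketch below uses the model R(t) proportional to t^(2/3), for which k = 0 and q_0 = 1/2; the closed form d_L = (2c/H_0)[1 + z - (1 + z)^(1/2)] for that model is an outside assumption, not something derived in this section, and the function names are illustrative.

```python
import math

C_KM_S = 299792.458  # speed of light in km/sec


def d_L_series_mpc(z, H0, q0):
    """Luminosity distance in Mpc from the two-term series (14.6.8),
    with the factor c restored; H0 in km/sec/Mpc."""
    return (C_KM_S / H0) * (z + 0.5 * (1.0 - q0) * z * z)


def distance_modulus_series(z, H0, q0):
    """m - M from Eq. (14.6.10); H0 in km/sec/Mpc."""
    return (25.0 - 5.0 * math.log10(H0) + 5.0 * math.log10(C_KM_S * z)
            + 1.086 * (1.0 - q0) * z)


def d_L_matter_dominated_mpc(z, H0):
    """Closed-form luminosity distance for R(t) ~ t**(2/3) (k = 0, q0 = 1/2).
    Assumed here for comparison only."""
    return (2.0 * C_KM_S / H0) * (1.0 + z - math.sqrt(1.0 + z))


for z in (0.01, 0.1, 0.3):
    exact = d_L_matter_dominated_mpc(z, 100.0)
    series = d_L_series_mpc(z, 100.0, 0.5)
    print(f"z = {z:4.2f}   closed form = {exact:8.2f} Mpc   series = {series:8.2f} Mpc")
```

The two agree closely at small z and drift apart as z grows, which is the sense in which only the first two terms of the series are tested by observations at modest red shifts.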

In order to measure q_0, we need to go out to large values of z (say, z > 0.1), 
where only brightest cluster galaxies (or possibly supernovae) can be used as 
distance indicators. However, it is only the shape of the curve of d_L or l or m versus 
z that is needed. 

In order to measure H_0, a single object with z < 0.1 is all we need, but we 
have to know its absolute luminosity as well as its red shift and apparent luminosity. 
Also, its red shift must be large enough (say, z > 0.01) so that its radial 
velocity reflects the general expansion of the universe, rather than a local velocity 
anomaly. Unfortunately, the Virgo cluster, whose luminosity is known from 
observation of its brightest stars and globular clusters (as in “rung 4” of the 
cosmic distance ladder described in the last section), has a radial velocity of only 
about 1000 km/sec, which is not large enough to ensure the dominance of the cosmological 
recession. There is a possibility that “rung 4” can be employed out to 
larger red shifts, for example, by using the angular diameters of H II regions. 
However, at present the only way to extend measurements of H_0 as well as q_0 out 
to large red shifts is to use all five rungs of the cosmic distance ladder, taking as 
distance indicators the brightest galaxies of rich clusters. 

This program is subject to a great many complications. Some, which are now 
taken into account by applying well-understood corrections to the data, include: 

(A) Galactic Rotation. The rotation of our galaxy gives the sun a velocity of 
about 215 km/sec. This produces systematic red or blue shifts in the spectra of 
distant galaxies, which are routinely subtracted from the observed red shifts in 
calculating the “cosmological” red shift z. 

(B) Aperture. Since the edges of galaxies fade gradually into the background 
light of the sky, it is necessary to refer all measurements of galactic apparent 
luminosities to a standard telescope aperture. 

(C) k-Term. As discussed in Section 14.4, the red shift will distort the 
frequency distribution of light from distant objects, so that their visual or blue 
magnitudes reflect their absolute luminosities at higher frequencies than for near 
objects. If we know the intrinsic frequency distribution, we can correct for this 
effect by using Eq. (14.4.35), with the result that the left-hand side of Eq. (14.6.10) 
is replaced with m_B - M_B - k_B(z), where k_B(z) is an explicitly known function of 
z calculated by Oke and Sandage. 39 In an alternative procedure developed by 
Baum, 40 the luminosity distribution is measured directly for each galaxy studied, 
so that all apparent magnitudes can be referred to the same emission frequency, 
and no k-term is needed. 

(D) Absorption. Our galaxy is known to absorb a certain fraction of the light 
coming to us from extragalactic objects. Treating our galaxy as an infinite flat 
slab, the distance through the galaxy that a light ray must travel on its way to us 
is proportional to cosec b, where b is the angle between the line of sight and the 
plane of the galaxy. The light will therefore be dimmed by a factor exp(-λ cosec b), 
with λ some constant. Taking λ from studies of the nearer extragalactic objects, the 
result is then that the left-hand side of Eq. (14.6.10) is replaced with the corrected 
distance modulus: 

$$ (m - M)_{\rm corr} = m_B - M_B - k_B(z) - A_B(b) $$

where, roughly, 

$$ A_B(b) \simeq 0.25\ \mathrm{cosec}\, b $$

(This is somewhat of an oversimplification. Sandage 41 first applies an absorption 
correction A_V(b) = 0.18 (cosec b - 1) to the visual magnitudes, then converts to 
blue magnitudes, and then applies an additional correction A_B = 0.25.) No 
correction is generally made for extragalactic absorption. 42 (See Section 15.4.) 
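
The rough absorption correction can be sketched as follows. The code uses only the simple form A_B(b) = 0.25 cosec b, not Sandage's more detailed procedure; the k-term is left as a number to be supplied by the caller, since its functional form is not given in this section; and the sample magnitudes are made up for illustration.

```python
import math


def cosec_deg(b_deg):
    """Cosecant of the galactic latitude b, given in degrees."""
    return 1.0 / math.sin(math.radians(abs(b_deg)))


def absorption_blue(b_deg, coefficient=0.25):
    """Rough blue absorption in magnitudes for the flat-slab model."""
    return coefficient * cosec_deg(b_deg)


def corrected_modulus(m_B, M_B, b_deg, k_B=0.0):
    """(m - M)_corr = m_B - M_B - k_B(z) - A_B(b); k_B is supplied by the caller."""
    return m_B - M_B - k_B - absorption_blue(b_deg)


def slab_constant(coefficient=0.25):
    """Constant in the dimming factor exp(-constant * cosec b) that corresponds
    to an absorption of `coefficient` * cosec b magnitudes."""
    return coefficient * math.log(10.0) / 2.5


# Hypothetical galaxy: m_B = 15.0, M_B = -21.7, galactic latitude 30 degrees.
print(f"A_B(30 deg)        = {absorption_blue(30.0):.2f} mag")
print(f"corrected modulus  = {corrected_modulus(15.0, -21.7, 30.0):.2f}")
print(f"slab constant      = {slab_constant():.3f}")
```

Because cosec b grows without bound toward the galactic plane, the slab model is only usable at moderate and high latitudes, where the correction is a few tenths of a magnitude.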

In addition to the above complications, which are pretty well understood, 
there are a number of other possible sources of error, whose status is much more in 
doubt. 


(E) Uncertainty in L. As emphasized in Section 14.5, a new correction at any 
rung of the cosmic distance ladder, such as a change in the distance to the Hyades or 
the Virgo cluster, would require a corresponding correction to the estimated absolute 
luminosity of the brightest E galaxies. Inspection of Eq. (14.6.9) or (14.6.10) shows 
that this would affect the value of Hubble’s constant, but would not change the 
deacceleration parameter q_0. 

(F) Scott Effect. It was also emphasized in the last section that, if there is 
no sharp upper limit to the absolute luminosity of cluster galaxies, then the 
tendency to select richer clusters at greater distances would mean that the absolute 
luminosities of their brightest galaxies would increase with z. According to Eq. 
(14.6.9), this selection effect would lead to an overestimate of the deacceleration 
parameter q_0. However, the Scott effect, if real, would only enter at very great 
distances, and therefore would have little effect on the value of H_0. 

(G) Shear Field. De Vaucouleurs 43 has suggested the existence of a local 
anisotropy in the galactic velocity field, encompassing our own local group and the 
Virgo cluster. If this is true, it could mean that red shifts with cz less than about 
4000 km/sec do not accurately reflect the general expansion of the universe. 


(H) Galactic Evolution. As we look further and further out into space, we 
see galaxies that are presumably younger and younger. It may be that the 
luminosity of the brightest E galaxies is a function L(t_1) of the time the light was 
emitted. Equation (14.6.5) tells us then that L in Eq. (14.6.9) should be replaced 
with 

$$ L(t_1) = L(t_0)\left[1 - E_0 (t_0 - t_1) + \cdots\right] = L(t_0)\left[1 - \frac{E_0 z}{H_0} + \cdots\right] $$

where 

$$ E_0 \equiv \frac{\dot{L}(t_0)}{L(t_0)} \tag{14.6.11} $$

The effect would be that q_0 in Eq. (14.6.9) would be replaced with an effective 
deacceleration parameter 

$$ q_0^{\rm eff} = q_0 - \frac{E_0}{H_0} \tag{14.6.12} $$

so that astronomical observations would really measure q_0^eff, not q_0. Sandage has 
recently given two different estimates for the rate of change of L for brightest E 
galaxies; in our notation, they are 44 

$$ E_0 = (0.04 \pm 0.02)/10^9\ \mathrm{years} \tag{14.6.13} $$

and 45 

$$ E_0 = (0.00 \pm 0.05)/10^9\ \mathrm{years} \tag{14.6.14} $$

As we shall see, a rate of evolution E_0 of order 0.04/10^9 years would have an 
important effect on the value of q_0^eff. 
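
To see why such a rate matters, one can evaluate the shift E_0/H_0 of Eq. (14.6.12) for a few values of Hubble's constant. The sketch below relies on the conversion quoted earlier, 100 km/sec/Mpc = (9.78 x 10^9 years)^-1; the trial values of H_0 are chosen here purely for illustration and are not taken from the source.

```python
HUBBLE_TIME_AT_100_GYR = 9.78  # 1/H0 in units of 10^9 years when H0 = 100 km/sec/Mpc


def hubble_time_gyr(H0):
    """1/H0 in units of 10^9 years, for H0 in km/sec/Mpc."""
    return HUBBLE_TIME_AT_100_GYR * 100.0 / H0


def q0_shift(E0_per_gyr, H0):
    """E0/H0: the amount by which the effective deacceleration parameter
    lies below q0 when E0 is given per 10^9 years."""
    return E0_per_gyr * hubble_time_gyr(H0)


for H0 in (50.0, 75.0, 100.0):  # illustrative values only
    print(f"H0 = {H0:5.1f} km/sec/Mpc   E0/H0 = {q0_shift(0.04, H0):.2f}")
```

For any of these values the shift is several tenths, comparable to the difference between q_0 > 1/2 and q_0 < 1/2 that decides the sign of k.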

Returning now to the observations of red shifts and luminosity distances, we 
must pick up our story where we left it in Section 14.3, with Hubble’s 1929 
discovery 13 of a linear relation between d_L and z. Hubble estimated the distance 
to 18 nearby galaxies from the apparent magnitudes of their brightest stars, and 
plotted the results against Slipher’s red shifts for these objects. (The absolute 
magnitude of the brightest stars was known from studies of our local group of 
galaxies, whose distance was known from observation of their Cepheid variables, 
whose P-L curve had been calibrated by Shapley from statistical studies of proper 
motions and radial velocities. For details, see the last section.) The most distant 
galaxies used by Hubble were members of the Virgo cluster, with a radial velocity 
of 1000 km/sec. This is not much greater than the r.m.s. random galactic velocity, 
and Hubble’s data points were consequently spread all over the d_L versus z plot. 
Nevertheless, he was somehow able to deduce a “roughly linear” relation between 
cz and d_L, with slope 

$$ H_0 \simeq 500\ \mathrm{km/sec/Mpc} \simeq [2 \times 10^9\ \mathrm{years}]^{-1} $$
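
The conversion between a Hubble constant in km/sec/Mpc and an inverse time is a matter of units. The sketch below uses approximate values for the number of kilometers in a megaparsec and the number of seconds in a year; both constants are supplied here and are not taken from the source.

```python
KM_PER_MPC = 3.0857e19      # approximate kilometers in one megaparsec
SECONDS_PER_YEAR = 3.156e7  # approximate seconds in one year


def hubble_time_years(H0_km_s_mpc):
    """1/H0 in years for H0 given in km/sec/Mpc."""
    return KM_PER_MPC / H0_km_s_mpc / SECONDS_PER_YEAR


for H0 in (500.0, 100.0):
    print(f"H0 = {H0:5.0f} km/sec/Mpc   1/H0 ~ {hubble_time_years(H0):.2e} years")
```

Hubble's slope thus corresponds to an expansion time scale of about 2 x 10^9 years, and each halving of H_0 doubles it.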

At this very time, Milton L. Humason was beginning a program of red-shift 
measurements at much greater distances, using the 100-in. reflector at Mount 
Wilson to study the brightest galaxies in clusters. His first definite result, a radial 
velocity cz = 3779 km/sec for the galaxy NGC7619, was used by Hubble in his 
1929 paper 13 to check the linearity of the relation between cz and d_L. Assuming 
this relation to be linear, with slope 500 km/sec/Mpc, Hubble could deduce a 
distance 7.8 Mpc for NGC7619, so that its apparent magnitude m = 11.8 implied 
an absolute magnitude M = -17.65. Hubble also calculated the absolute magnitudes 
of the 18 galaxies used in his determination of H_0 (plus six additional 
members of our local group) from their distances and apparent magnitudes, and 
found M to range from -12.7 to -17.7. Since NGC7619, as the brightest galaxy in 
a cluster, is presumably brighter than average, this could be considered reasonably 
good agreement, indicating that cz is indeed roughly proportional to d_L out to 
z ~ 0.013. 
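
The arithmetic behind these numbers can be checked with a short sketch. The function names and the unit-conversion constants below are illustrative choices, not taken from the text; only the inputs (500 km/sec/Mpc, $m = 11.8$, 7.8 Mpc) come from the passage above.

```python
import math

KM_PER_MPC = 3.0857e19   # kilometres in one megaparsec
SEC_PER_YEAR = 3.156e7   # seconds in one year (approximate)

def hubble_time_years(h0):
    """Return 1/H_0 in years, for H_0 given in km/sec/Mpc."""
    return KM_PER_MPC / h0 / SEC_PER_YEAR

def absolute_magnitude(m, distance_mpc):
    """Distance modulus: M = m - 5 log10(d / 10 pc), with d given in Mpc."""
    return m - 5.0 * math.log10(distance_mpc * 1.0e6 / 10.0)

t_hubble = hubble_time_years(500.0)          # Hubble's 1929 slope
M_ngc7619 = absolute_magnitude(11.8, 7.8)    # m and distance quoted in the text

print(f"1/H_0 = {t_hubble:.2e} years")
print(f"M(NGC7619) = {M_ngc7619:.2f}")
```

The Hubble time comes out close to $2 \times 10^9$ years, and the absolute magnitude agrees with the quoted $-17.65$ to within rounding.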

Continuing their collaboration, Hubble and Humason 46 by 1931 had verified 
the linearity of the relation between $cz$ and $d_L$ out to 20,000 km/sec ($z = 0.067$), 
and $H_0$ was revised to 550 km/sec/Mpc. The limit of the Mount Wilson telescope 
was reached in 1936, when Humason 47 recorded the radial velocity of the Ursa 
Major II cluster as 42,000 km/sec ($z = 0.14$). By plotting $\log_{10} z$ against the apparent 
photographic magnitude (with absorption and k-term corrections) of the fifth 
brightest galaxy in 10 clusters, ranging from Virgo to UMa II, Hubble 48 could 
verify that the slope was close to 1/5, as would be expected [see Eq. (14.6.10)] if the 
red shift is a linear function of luminosity distance out to $z \approx 0.14$. A definite 
measurement of $q_0$ had to wait until completion of the 200-in. reflector at Mount 
Palomar. 

Hubble also at this time 48 gave a new estimate of $H_0$, using 109 field galaxies, 
with $cz$ ranging up to 19,070 km/sec, as the fifth rung of the cosmic distance ladder. 
The absolute magnitudes of these field galaxies were assumed to be the same as the 
average, $M = -15.18$, for 145 resolved galaxies (only 29 of which belonged to 
the sample of 109 field galaxies), whose distances could be determined from the 
apparent magnitude of their brightest stars. Plotting $m - M$ against $\log_{10} cz$ gave 
$H_0 = 520$ km/sec/Mpc. A separate determination, based directly on the brightest 
stars in 29 resolved field galaxies, gave $H_0 = 526$ km/sec/Mpc. 

In 1950, Palomar was ready, and Hubble’s program was taken up again. As 
we saw in the last section, the first consequence of the observations at Palomar was 
the recalibration by Baade 27 of the Cepheid period-luminosity relation. This 
immediately doubled the extragalactic distance scale, and hence halved the value 
of Hubble’s constant, to about 260 km/sec/Mpc. In 1956, Humason, Mayall, and 
Sandage 49 published an exhaustive survey of the information then available on 
red shifts and distances. Under the assumption that the brightest galaxies in 
clusters have the same absolute magnitude as M31, the intercept of the plot of 
$m_v - k_v - A_v$ against $\log_{10} cz$ for the brightest galaxies in 18 clusters (out to 
$z = 0.18$) gave $H_0 = 180$ km/sec/Mpc. (A separate determination, based on the 
average red shift of the Virgo cluster and the apparent magnitudes of the brightest 
stars in the Virgo cluster galaxy NGC4321, gave $H_0 = 176$ km/sec/Mpc.) Also, 
with no evolutionary correction, the curvature of the graph of $m_v - k_v - A_v$ 
versus $\log_{10} z$ gave 

$$q_0 = 3.7 \pm 0.8$$

A year later, Baum 40 reported a study of eight clusters, using eight-color photometry 
to avoid the need for a k-term correction. His result was 

$$q_0 = 1 \pm \tfrac{1}{2}$$

Next, Sandage 50 reexamined Hubble’s use of brightest stars at “rung 4” of the 
distance ladder, and in 1958 concluded that some of these “brightest” stars are 
H II regions, which are 1.8 magnitudes brighter than the true brightest stars. The 
cosmic distance scale expanded again, and $H_0$ dropped to 75 km/sec/Mpc. Further 
analysis led Sandage 51 to give a value for $H_0$ of 98 km/sec/Mpc in 1961. Also, 
preliminary calculations of galactic evolution led Sandage 52 to estimate that 
galaxies are decreasing in luminosity, with $E_0 \approx -0.8 H_0$, so that Baum’s result 
led to $q_0 = 0.2 \pm 0.5$. 





Figure 14.11 The radio galaxy 3C295 in Bootes. The spectrum of this galaxy, shown 
below, reveals a red shift $z = 0.46$, the largest yet observed for any galaxy. This 
photograph and spectrograph were taken with the 200-in. telescope at Mt. Palomar. 
(Courtesy Mt. Wilson and Mt. Palomar observatories.) 




Meanwhile, red shifts were becoming available for radio galaxies. In 1960, 
R. Minkowski 53 discovered that one of these, 3C295, has red shift $z = 0.46$, the 
largest yet known for any galaxy. (See Figure 14.11.) Sandage 34 in 1968 included 
these new red shifts, along with the data considered earlier by Humason, Mayall, 
and Sandage 49 and Baum, 40 in a study of 41 first-ranked cluster members. His 
data, transformed to the blue magnitude system, could be well fit by the relation 

$$m_{B,\mathrm{corr}} \equiv m_B - k_B - A_B = 5 \log_{10} cz - 6.06 \tag{14.6.15}$$

where $c = 3 \times 10^5$ km/sec. (See Figure 14.12.) The dispersion of points about this curve 
was only about ±0.3 magnitudes, indicating that these brightest galaxies really do 
have a uniform absolute magnitude $M_B$. The Hubble constant is thus reliably given 
by (14.6.10) and (14.6.15) as 

$$5 \log_{10} H_0\,(\text{km/sec/Mpc}) = M_B + 31.06 \tag{14.6.16}$$



Figure 14.12 Red shifts and corrected apparent magnitudes for 42 first-ranked cluster 
galaxies. Data are taken from a 1970 review of Sandage. 44 Curves represent fits of 
Eq. (14.6.10) to the data. 


Using globular clusters instead of brightest stars to fix the distance of the Virgo 
cluster, Sandage 34 estimated that brightest E galaxies have $M_B = -21.68$, so 
that 

$$H_0 = 75.3^{+19}_{-15}\ \text{km/sec/Mpc} = [13.0^{+3.7}_{-2.7} \times 10^9\ \text{years}]^{-1} \tag{14.6.17}$$

[The error quoted here includes uncertainties of ±0.3 magnitudes in the apparent 
magnitude of the brightest globular cluster in M87; ±0.2 magnitudes in the 
apparent magnitude of the first-ranked Virgo galaxy NGC4472; and ±0.3 magnitudes 
in fitting the data with (14.6.15).] The persistence of a straight-line fit out 
to $z = 0.46$ indicates according to (14.6.10) that $q_0^{\rm eff}$ cannot be too different from 
unity. Peach 54 gives a value, with evolution neglected: 


$$q_0 = 1.5 \pm 0.4 \tag{14.6.18}$$

while Sandage gives 55 

$$q_0 = 1.2 \pm 0.4 \tag{14.6.19}$$
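
As a rough numerical check of Eqs. (14.6.16) and (14.6.17), the sketch below evaluates $H_0$ from $M_B = -21.68$. Combining the three quoted magnitude uncertainties in quadrature is an assumption made here for illustration; the text does not say how they were combined.

```python
import math

def hubble_constant_from_MB(M_B):
    """Eq. (14.6.16): 5 log10 H_0 = M_B + 31.06, with H_0 in km/sec/Mpc."""
    return 10.0 ** ((M_B + 31.06) / 5.0)

M_B = -21.68
H0 = hubble_constant_from_MB(M_B)

# Assumption: the three magnitude errors quoted in the text add in quadrature.
sigma_mag = math.sqrt(0.3**2 + 0.2**2 + 0.3**2)
H0_low = hubble_constant_from_MB(M_B - sigma_mag)
H0_high = hubble_constant_from_MB(M_B + sigma_mag)

print(f"H_0 = {H0:.1f} km/sec/Mpc, range {H0_low:.1f} to {H0_high:.1f}")
```

Under that assumption the range comes out close to the asymmetric errors quoted in (14.6.17).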


What have we really learned from this forty-year program of astronomical 
observations? There is little doubt that (14.6.15) is a good fit to the data for small 
$z$, so that $H_0$ is given by Eq. (14.6.16). These results have hardly changed at all 
since Hubble’s work in 1936. What has changed dramatically is the distance scale, 
which controls our estimates of $M_B$ for first-ranked cluster galaxies, and hence 
plays a crucial role in the determination of $H_0$. A recent survey by Sandage gives 44 

$$50\ \text{km/sec/Mpc} < H_0 < 130\ \text{km/sec/Mpc}$$

or 

$$20 \times 10^9\ \text{years} > H_0^{-1} > 7.5 \times 10^9\ \text{years}$$

and this probably represents a fair estimate of the range of possible error in $H_0$ 
due to uncertainties in the distance scale. Another change since 1936 is a tripling 
of the available range of red shifts. We can now be reasonably confident that $q_0^{\rm eff}$ 
is between 1/2 and 3/2. However, the role of evolutionary and selection effects is still 
very much in doubt. If $H_0 = 75$ km/sec/Mpc, and if galactic luminosities increase 
at the rate (14.6.13), then the true deacceleration parameter $q_0$ is related to the 
observed quantity $q_0^{\rm eff}$ by 

$$q_0 = q_0^{\rm eff} + 0.5$$

This correction is highly uncertain; remember that it was believed to have the 
opposite sign a few years ago! Thus we now know $H_0$ to within a factor of 2, and it 
seems likely that $q_0 > 0$, indicating gravitational braking, but about the precise 
value of $q_0$ we know almost as little as we did in 1931. (As this book goes to press, 
rumor has it that $H_0$ is going down again, perhaps even below 50 km/sec/Mpc.) 

In 1963 a discovery was made by Maarten Schmidt, 56 which at first seemed to 
offer hope of a tremendous improvement in our knowledge of the cosmic scale 
factor. Since 1960, a number of radio sources had been identified with quasi-stellar 
objects, optical sources whose angular diameters are too small to be resolved at 
Palomar. Schmidt discovered that one of them, 3C273, had a red shift $z = 0.158$, 
corresponding to a luminosity distance (if $H_0 = 75$ km/sec/Mpc) of 630 Mpc. At 
this distance, its absolute luminosity would have to be greater than that of a whole 
galaxy, even though its small angular diameter (< 0.5") implied a size less than 
1500 pc. From 1963 until the present, several hundred quasi-stellar objects have 
been discovered, 57 of which a good fraction have $z > 1$, and a few have $z > 2$. 
At the same time, the use of lunar occultation and long-base-line radio interferometry, 
and the observation of short-period time variations, made it clear that 
much of the enormous energy output of these objects comes from regions much less 
than 1 pc in diameter. The discovery of the quasi-stellar objects therefore revived 
interest in theories of gravitational collapse, already discussed in Chapter 11. It 
also opened up the possibility of extending the empirical relation between $d_L$ and $z$ 
out to really large distances and red shifts, provided that some method could be 
found to determine the absolute luminosities of the quasi-stellar objects. 
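
The distance and size quoted for 3C273 follow from the first-order relation $d_L \approx cz/H_0$ and the small-angle formula. The sketch below reproduces them; the helper names are illustrative, and the small-angle estimate deliberately ignores the difference between luminosity distance and angular diameter distance, as the rough figures in the text appear to do.

```python
ARCSEC_PER_RADIAN = 206265.0
C_KM_S = 3.0e5   # speed of light in km/sec

def hubble_distance_mpc(z, h0):
    """First-order distance d = cz / H_0 in Mpc, with H_0 in km/sec/Mpc."""
    return C_KM_S * z / h0

def linear_size_pc(theta_arcsec, distance_mpc):
    """Small-angle size in parsecs for an angle in arcsec at a distance in Mpc."""
    return theta_arcsec / ARCSEC_PER_RADIAN * distance_mpc * 1.0e6

d_3c273 = hubble_distance_mpc(0.158, 75.0)
size_limit = linear_size_pc(0.5, d_3c273)

print(f"distance = {d_3c273:.0f} Mpc, size limit = {size_limit:.0f} pc")
```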

Unfortunately, the plot of $m_v$ versus $\ln z$ reveals no clear correlation of 
apparent magnitude with red shift. 57,58 If quasi-stellar objects are indeed at 
cosmological distances (and about this there remains some doubt 59,68), then they 
must have an extremely wide spread in absolute luminosities. The comparison of 
red shifts with apparent magnitudes will become cosmologically interesting for 
quasi-stellar objects only when we learn how to distinguish quasi-stellar objects of 
different absolute luminosity. 

It is nevertheless an interesting question of principle to ask what could be 
learned about $k$ and $R(t)$ if the luminosity distance could be exactly determined as 
a function $d_L(z)$ of red shift. It seems to be generally believed that knowledge of 
$d_L(z)$ would allow a unique determination of $k$ and $R(t)$. However, this is not the 
case. 60 The governing theoretical equations here are (14.3.1), (14.3.6), and (14.4.14). 
Equation (14.3.1) can be replaced with an equivalent differential equation 

$$\frac{1}{R(t_1)} \frac{dt_1}{dz} = -(1 - k r_1^2)^{-1/2} \frac{dr_1}{dz} \tag{14.6.20}$$

with the initial condition that 

$$t_1 = t_0 \quad \text{for} \quad z = 0 \tag{14.6.21}$$

Equations (14.3.6) and (14.4.14) simply serve to eliminate the unknowns $R(t_1)$ 
and $r_1$, so that (14.6.20) becomes 

$$(1 + z) \frac{dt_1}{dz} = -\left[1 - k R^{-2}(t_0)(1 + z)^{-2} d_L^2(z)\right]^{-1/2} \frac{d}{dz}\left[(1 + z)^{-1} d_L(z)\right]$$

Thus $t_1$ can be calculated as a function of $z$ by a single integration 

$$t_1(z) = t_0 - \int_0^z (1 + z')^{-1} \left[1 - k R^{-2}(t_0)(1 + z')^{-2} d_L^2(z')\right]^{-1/2} \frac{d}{dz'}\left[(1 + z')^{-1} d_L(z')\right] dz'$$

and the function $R(t)$ can then be determined by solving the functional equation: 

$$t = t_0 - \int_0^{[R(t_0)/R(t)] - 1} (1 + z)^{-1} \left[1 - k R^{-2}(t_0)(1 + z)^{-2} d_L^2(z)\right]^{-1/2} \frac{d}{dz}\left[(1 + z)^{-1} d_L(z)\right] dz \tag{14.6.22}$$

Note that this procedure will give a solution for any assumed values of the 
constants $k$, $R(t_0)$, or $t_0$. Hence there is no way that measurements of luminosity 
distances and red shifts can determine $k$ or $R(t_0)$, unless we supplement the 
Robertson-Walker metric with dynamical equations for $R(t)$, as will be done in Chapter 
15. This curious ambiguity can also be observed in calculating the expansion of 
$d_L(z)$ in powers of $z$; the first-order term depends on $\dot R(t_0)/R(t_0)$; the second-order 
term depends on $\dot R(t_0)/R(t_0)$ and $\ddot R(t_0)/R(t_0)$; the third-order term depends on 
$\dot R(t_0)/R(t_0)$, $\ddot R(t_0)/R(t_0)$, $\dddot R(t_0)/R(t_0)$, and $k/R^2(t_0)$; and terms of order $z^N$ with 
$N > 3$ depend on $k/R^2(t_0)$ and the first $N$ logarithmic derivatives of $R(t)$ at $t_0$. 
Thus no measurement of any number of derivatives of $d_L(z)$ can ever allow us to 
determine $k/R^2(t_0)$. However, once we assume values for $k$ and $R(t_0)$, Eq. (14.6.22) 
will allow us to compute $R(t)$ as a function of $t - t_0$ from the empirical relation 
between $d_L$ and $z$. 
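
The ambiguity can be seen numerically. The sketch below integrates the integrand of Eq. (14.6.22) by Simpson's rule for a trial function $d_L(z) = z(1 + z)/H$, which is an illustrative choice (units with $H = 1$), and shows that two different assumed pairs $(k, R(t_0))$ both give a perfectly good look-back time. The function names and step counts are not from the text.

```python
import math

def lookback_time(z, d_L, k=0.0, R0=1.0, steps=2000):
    """t_0 - t_1(z) from the integral in Eq. (14.6.22), using Simpson's rule."""
    def f(x):
        return d_L(x) / (1.0 + x)            # (1 + z)^-1 d_L(z)

    def integrand(x):
        h = 1.0e-6
        dfdz = (f(x + h) - f(x - h)) / (2.0 * h)   # central difference
        root = 1.0 - k * (d_L(x) / (R0 * (1.0 + x))) ** 2
        return dfdz / ((1.0 + x) * math.sqrt(root))

    dx = z / steps                            # steps must be even
    total = integrand(0.0) + integrand(z)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * integrand(i * dx)
    return total * dx / 3.0

def trial_d_L(z, H=1.0):
    """Illustrative luminosity distance d_L = z (1 + z) / H."""
    return z * (1.0 + z) / H

t_flat = lookback_time(1.0, trial_d_L, k=0.0)            # k = 0
t_closed = lookback_time(1.0, trial_d_L, k=1.0, R0=2.0)  # k = +1, R(t_0) = 2/H

print(t_flat, t_closed)
```

For $k = 0$ the integral reduces to $H^{-1} \ln(1 + z)$; the $k = +1$ choice yields a different, equally admissible $t_1(z)$ from the same $d_L(z)$.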

In principle, we could also determine the form of the function $R(t)$ by observing 
a single spectral line for a long enough time. According to Eqs. (14.3.6) and 
(14.3.1), the red shift of a comoving source changes at a rate 

$$\frac{dz}{dt_0} = \frac{\dot R(t_0)}{R(t_1)} - \frac{R(t_0)\dot R(t_1)}{R^2(t_1)} \left(\frac{dt_1}{dt_0}\right) = \frac{\dot R(t_0) - \dot R(t_1)}{R(t_1)} \tag{14.6.23}$$

For $z \ll 1$, we can approximate $t_0 - t_1$ by the first term in the series (14.6.5), 
and (14.6.23) reads 

$$\frac{1}{z} \frac{dz}{dt_0} \simeq \frac{\ddot R(t_0)}{H_0 R(t_0)} = -q_0 H_0 \tag{14.6.24}$$


It does not seem possible to measure this very slow change in red shift with present 
techniques. 61 
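
To get a feeling for how slow this change is, the sketch below evaluates Eq. (14.6.24) for illustrative values $z = 0.1$, $q_0 = 1$, and $H_0 = 75$ km/sec/Mpc. These inputs and the helper name are assumptions chosen for the example, not values singled out by the text.

```python
KM_PER_MPC = 3.0857e19   # kilometres in one megaparsec
SEC_PER_YEAR = 3.156e7   # seconds in one year (approximate)
C_KM_S = 3.0e5           # speed of light in km/sec

def redshift_drift_per_year(z, q0, h0):
    """dz/dt_0 = -q_0 H_0 z (valid for z << 1), with H_0 in km/sec/Mpc."""
    h0_per_year = h0 / KM_PER_MPC * SEC_PER_YEAR
    return -q0 * h0_per_year * z

drift = redshift_drift_per_year(0.1, 1.0, 75.0)
velocity_drift_cm_s = drift * C_KM_S * 1.0e5   # equivalent change in cz, cm/sec per year

print(f"dz/dt_0 = {drift:.2e} per year")
print(f"d(cz)/dt_0 = {velocity_drift_cm_s:.2f} cm/sec per year")
```

The equivalent velocity shift is a fraction of a centimeter per second per year.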


7 Number Counts 

Since the Hubble program has not yet succeeded in telling us very much about 
the cosmic scale factor $R(t)$, it is natural to widen our scope, and consider numbers 
of optical or radio sources, as functions of apparent luminosity and/or red shift. 
The study of number counts offers two potential advantages over the Hubble 
program: 

(A) The development of radio telescopes of very large aperture and sensitivity 
has led to the detection and resolution of thousands of faint radio sources, most of 
which are presumably at very great distances. The majority of these sources have 
not yet been identified with optical objects, so their red shifts are as yet unknown. 
(No radio lines have been observed from resolved radio sources, and their red shifts 
can therefore only be measured optically.) Not knowing $z$, the best use to which 
cosmologists can put these sources is to plot their number as a function of their 
strength. 

(B) The quasi-stellar objects discussed in the last section have measured red 
shifts ranging up to $z \approx 2$, but have too wide a spread in absolute luminosity to 
allow determination of the luminosity distance $d_L$. By plotting the number of 
quasi-stellar objects as a function of $z$ alone, or $z$ and $l$, we can eliminate some of 
the problems caused by the spread in $L$. 

To begin in a very general way, let us assume that at time $t_1$ there are 
$n(t_1, L)\, dL$ sources per unit volume with absolute luminosity between $L$ and 
$L + dL$. The proper volume element is 

$$dV = \sqrt{g}\, dr_1\, d\theta_1\, d\phi_1 = R^3(t_1)(1 - k r_1^2)^{-1/2} r_1^2\, dr_1 \sin\theta_1\, d\theta_1\, d\phi_1$$

so the number of sources between $r_1$ and $r_1 + dr_1$ with absolute luminosity 
between $L$ and $L + dL$ is 

$$dN = 4\pi R^3(t_1)(1 - k r_1^2)^{-1/2} r_1^2\, n(t_1, L)\, dr_1\, dL \tag{14.7.1}$$

The coordinates $r_1$ and $t_1$ are related according to Eq. (14.3.1), which we can 
write as 

$$r_1 = r(t_1) \tag{14.7.2}$$

where $r(t)$ is a function defined by the formula 

$$\int_t^{t_0} \frac{dt'}{R(t')} = \int_0^{r(t)} (1 - k r^2)^{-1/2}\, dr \tag{14.7.3}$$

Differentiation of Eq. (14.7.3) gives 

$$dr_1 = -(1 - k r_1^2)^{1/2}\, \frac{dt_1}{R(t_1)}$$

and Eq. (14.7.1) can therefore be written 

$$dN = 4\pi R^2(t_1) r^2(t_1) n(t_1, L)\, |dt_1|\, dL \tag{14.7.4}$$

The red shift and apparent luminosity of a source of absolute luminosity $L$ at $r_1$, $t_1$ 
are given by Eq. (14.3.6) and (14.4.12): 

$$z = \frac{R(t_0)}{R(t_1)} - 1 \tag{14.7.5}$$

$$l = \frac{L R^2(t_1)}{4\pi r_1^2 R^4(t_0)} \tag{14.7.6}$$

Hence the number of sources, with red shift less than $z$ and apparent luminosity 
greater than $l$, is given by the integral of (14.7.4) over all $L$ and a finite range of $t_1$: 

$$N(<z, >l) = \int_0^\infty dL \int_{\max\{t_z,\, t_l(L)\}}^{t_0} 4\pi r^2(t_1) R^2(t_1) n(t_1, L)\, dt_1 \tag{14.7.7}$$

where the lower limits, set by the conditions on red shift and apparent luminosity, 
are defined by 

$$R(t_z) = \frac{R(t_0)}{1 + z} \tag{14.7.8}$$

$$\frac{r^2(t_l)}{R^2(t_l)} = \frac{L}{4\pi l R^4(t_0)} \tag{14.7.9}$$

If red shifts are not observed, then the quantity of interest is the number $N(>l)$ 
of sources with apparent luminosity greater than $l$, which can be calculated by 
taking the lower limit in (14.7.7) to be just $t_l(L)$. If apparent luminosities are not 
observed, then the quantity of interest is the number $N(<z)$ of sources with red 
shift less than $z$, which can be calculated by taking the lower limit in (14.7.7) to be 
just $t_z$. (However, the observed number counts can only be used to put a lower 
bound on $N(<z)$, not to measure it, because any given optical or radio telescope will 
only detect sources above some minimum brightness.) 

Radio telescopes do not measure total apparent luminosities, but instead 
measure the flux density $S$, the power per unit antenna area and per unit frequency 
interval, at a fixed frequency. The flux density of a source at $r_1$, $t_1$ is 

$$S = \frac{P(\nu R(t_0)/R(t_1))\, R(t_1)}{R^3(t_0)\, r_1^2} \tag{14.7.10}$$

where $P$ is the intrinsic power, the power emitted per unit solid angle and per unit 
frequency interval. (See Eq. (14.4.35), with $S = l'$, $P = L'/4\pi$.) 

Following the same derivation that led to Eq. (14.7.7), we find that the 
number of sources, with red shift less than $z$ and flux density at frequency $\nu$ greater 
than $S$, is given by 

$$N(<z, >S; \nu) = \int_0^\infty dP \int_{\max\{t_z,\, t_S(P)\}}^{t_0} 4\pi r^2(t_1) R^2(t_1)\, n\!\left(t_1, P, \nu \frac{R(t_0)}{R(t_1)}\right) dt_1 \tag{14.7.11}$$

where $t_S(P)$ is defined by 

$$\frac{r^2(t_S)}{R(t_S)} = \frac{P}{S R^3(t_0)} \tag{14.7.12}$$

The analysis of radio source counts is very much simplified by the observation 62 
that such radio sources generally have “straight” spectra, 

$$P \propto \nu^{-\alpha} \tag{14.7.13}$$

where $\alpha$, the spectral index, is about 0.7 to 0.8. In this case, the number of sources 
with intrinsic power at frequency $\nu$ between $P$ and $P + dP$ is of the form 

$$n(t, P, \nu)\, dP = n\!\left(t, P\left(\frac{\nu}{\nu_0}\right)^{\alpha}, \nu_0\right) d\!\left(P\left(\frac{\nu}{\nu_0}\right)^{\alpha}\right)$$

where $\nu_0$ is any arbitrary fixed frequency. The number density then obeys the 
scaling rule 

$$n(t, P, \nu) = \left(\frac{\nu}{\nu_0}\right)^{\alpha} n\!\left(t, P\left(\frac{\nu}{\nu_0}\right)^{\alpha}, \nu_0\right) \tag{14.7.14}$$

By changing the variable of integration in Eq. (14.7.11) from $P$ to $P[R(t_0)/R(t_1)]^{\alpha}$, 
we can refer the number density within the integral to the fixed frequency $\nu$, and 
find 

$$N(<z, >S; \nu) = \int_0^\infty dP \int_{\max\{t_z,\, t_{S\alpha}(P)\}}^{t_0} 4\pi r^2(t_1) R^2(t_1) n(t_1, P, \nu)\, dt_1 \tag{14.7.15}$$

where $t_{S\alpha}(P)$ is defined by 

$$\frac{r^2(t_{S\alpha})}{R^2(t_{S\alpha})} \left[\frac{R(t_{S\alpha})}{R(t_0)}\right]^{1 - \alpha} = \frac{P}{S R^4(t_0)} \tag{14.7.16}$$

The number counts will now obey the scaling rule 

$$N(<z, >S; \nu) = N\!\left(<z, >S\left(\frac{\nu}{\nu_0}\right)^{\alpha}; \nu_0\right) \tag{14.7.17}$$

To the extent that (14.7.17) is verified by observation, we can conclude that all 
sources do have the “straight” spectrum (14.7.13), with the same spectral index $\alpha$. 

If there were no creation, destruction, or evolution of the sources during the 
time it takes for light to reach us from the furthest observed source, then $n(t, L)$ 
and $n(t, P, \nu)$ would have the simple time dependence (14.2.17): 

$$n(t, L) = \left[\frac{R(t_0)}{R(t)}\right]^3 n(t_0, L) \tag{14.7.18}$$

$$n(t, P, \nu) = \left[\frac{R(t_0)}{R(t)}\right]^3 n(t_0, P, \nu) \tag{14.7.19}$$

In this case, the observed number counts could be used to gain information about $k$ 
and $R(t)$. Alternatively, if we had a cosmological model for $k$ and $R(t)$, we could use 
the observed number counts to deduce the functional dependence of the number 
density $n$ on $t$ and $L$ or $P$. 

A good deal of insight into the results to be expected from these two different 
modes of analysis can be gained by concentrating on the special cases where $z$ is 
small or $l$ or $S$ is large. The lower limits on the $t_1$-integral in Eq. (14.7.7) and 
(14.7.11) are then close to $t_0$, so we can use the general expansions (14.6.1) and 
(14.6.6): 

$$R(t_1) = R(t_0)\{1 - H_0(t_0 - t_1) + \cdots\}$$

$$r(t_1) = R^{-1}(t_0)(t_0 - t_1)\{1 + \tfrac{1}{2} H_0 (t_0 - t_1) + \cdots\}$$

We will also express the number densities as expansions in $t_0 - t_1$: 

$$n(t_1, L) = n(t_0, L)\{1 - \beta_0(L) H_0 (t_0 - t_1) + \cdots\} \tag{14.7.20}$$

$$n\!\left(t_1, P, \nu \frac{R(t_0)}{R(t_1)}\right) = n(t_0, P, \nu)\{1 - [\beta_0(P, \nu) + 2\alpha_0(P, \nu)] H_0 (t_0 - t_1) + \cdots\} \tag{14.7.21}$$

where $\beta_0$ measures the rate of change of source density 

$$\beta_0(L) \equiv H_0^{-1} \left.\frac{\partial}{\partial t} \ln n(t, L)\right|_{t = t_0} \tag{14.7.22}$$

$$\beta_0(P, \nu) \equiv H_0^{-1} \left.\frac{\partial}{\partial t} \ln n(t, P, \nu)\right|_{t = t_0} \tag{14.7.23}$$

and $\alpha_0$ is an effective spectral index 

$$2\alpha_0(P, \nu) \equiv -\nu \frac{\partial}{\partial \nu} \ln n(t_0, P, \nu) \tag{14.7.24}$$


(The motivation for this last definition will be made clear below.) Then, for $t$ close 
to $t_0$, 

$$\int_t^{t_0} 4\pi r^2(t_1) R^2(t_1) n(t_1, L)\, dt_1 = \frac{4\pi}{3} n(t_0, L)(t_0 - t)^3 \{1 - \tfrac{3}{4}[\beta_0(L) + 1] H_0 (t_0 - t) + \cdots\} \tag{14.7.25}$$

$$\int_t^{t_0} 4\pi r^2(t_1) R^2(t_1)\, n\!\left(t_1, P, \nu \frac{R(t_0)}{R(t_1)}\right) dt_1 = \frac{4\pi}{3} n(t_0, P, \nu)(t_0 - t)^3 \{1 - \tfrac{3}{4}[\beta_0(P, \nu) + 2\alpha_0(P, \nu) + 1] H_0 (t_0 - t) + \cdots\} \tag{14.7.26}$$

If $z$ is small, then the lower limit on the integral (14.7.25) is determined by (14.7.8), 
which gives 

$$H_0(t_0 - t_z) = z - \left(1 + \frac{q_0}{2}\right) z^2 + \cdots$$

Hence (14.7.7) gives the number of sources with red shift less than $z$ as 

$$N(<z) = \frac{4\pi z^3}{3 H_0^3} \int_0^\infty n(t_0, L)\{1 - \tfrac{3}{4}[\beta_0(L) + 2q_0 + 5] z + \cdots\}\, dL \tag{14.7.27}$$

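If $l$ is large, then the lower limit on the integral (14.7.25) is determined by (14.7.9), 
which gives 

$$t_0 - t_l(L) = \left(\frac{L}{4\pi l}\right)^{1/2} \left\{1 - \frac{3}{2}\left(\frac{L H_0^2}{4\pi l}\right)^{1/2} + \cdots\right\}$$

Now (14.7.7) gives the number of sources with apparent luminosity greater than 
$l$ as 

$$N(>l) = \frac{4\pi}{3} (4\pi l)^{-3/2} \int_0^\infty n(t_0, L) \left\{1 - \tfrac{3}{4}[\beta_0(L) + 7]\left(\frac{L H_0^2}{4\pi l}\right)^{1/2} + \cdots\right\} L^{3/2}\, dL \tag{14.7.28}$$

Finally, if $S$ is large, then the lower limit on the integral (14.7.26) is determined 
by (14.7.12), which gives 

$$t_0 - t_S(P) = \left(\frac{P}{S}\right)^{1/2} \left\{1 - \left(\frac{P H_0^2}{S}\right)^{1/2} + \cdots\right\}$$

Thus (14.7.11) gives the numbers of sources, with flux density at frequency $\nu$ 
greater than $S$, as 

$$N(>S; \nu) = \frac{4\pi}{3} S^{-3/2} \int_0^\infty n(t_0, P, \nu) P^{3/2} \left\{1 - \tfrac{3}{4}[\beta_0(P, \nu) + 2\alpha_0(P, \nu) + 5]\left(\frac{P H_0^2}{S}\right)^{1/2} + \cdots\right\} dP \tag{14.7.29}$$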


If all sources have the “straight” spectrum (14.7.13), then (14.7.14) and (14.7.24) 
give 

$$\alpha_0(P, \nu) = -\frac{\alpha}{2}\left[1 + P \frac{\partial}{\partial P} \ln n(t_0, P, \nu)\right] \tag{14.7.30}$$

so integration by parts allows us to make the substitution 

$$\alpha_0(P, \nu) \rightarrow \alpha \tag{14.7.31}$$

That is, the “effective spectral index” $\alpha_0$ may be replaced with $\alpha$ in Eq. (14.7.29), 
provided that sources have the spectrum $P \propto \nu^{-\alpha}$. 

Inspection of these results shows that measured values of $N(<z)$ for $z \ll 1$ 
could be used to deduce the deacceleration parameter $q_0$ if we knew the evolutionary 
parameter $\beta_0$, while, in contrast, measurement of $N(>l)$ or $N(>S; \nu)$ for 
large $l$ or $S$ cannot tell us anything about $q_0$, whatever we assume for $\beta_0$. 

If we assume no evolution, then $n$ has a time dependence given by (14.7.18) 
or (14.7.19) and (14.6.1), so (14.7.22) and (14.7.23) give 

$$\beta_0(L) = \beta_0(P, \nu) = -3 \quad \text{(no evolution)} \tag{14.7.32}$$

In this case, (14.7.27)-(14.7.29) give 

$$N(<z) = \frac{4\pi z^3}{3 H_0^3} \int_0^\infty n(t_0, L)\, dL\, \{1 - \tfrac{3}{2}(q_0 + 1) z + \cdots\} \tag{14.7.33}$$

$$N(>l) = \frac{4\pi}{3} (4\pi l)^{-3/2} \int_0^\infty n(t_0, L) \left\{1 - 3\left(\frac{L H_0^2}{4\pi l}\right)^{1/2} + \cdots\right\} L^{3/2}\, dL \tag{14.7.34}$$

and, for straight spectra, 

$$N(>S; \nu) = \frac{4\pi}{3} S^{-3/2} \int_0^\infty n(t_0, P, \nu) \left\{1 - \tfrac{3}{2}(\alpha + 1)\left(\frac{P H_0^2}{S}\right)^{1/2} + \cdots\right\} P^{3/2}\, dP \tag{14.7.35}$$

Thus the neglect of evolution leads to the definite predictions that $N(>l)$ must 
decrease with $l$ more slowly than $l^{-3/2}$, and, since $\alpha$ is positive, $N(>S; \nu)$ must 
decrease with $S$ more slowly than $S^{-3/2}$. 

However, this result appears to be contradicted by observation 69. The relevant 
radio source surveys are listed in Table 14.1. Jointly and severally, they yield a number 
count function 63 $N(>S; \nu)$ that decreases with $S$ (for $S > 5 \times 10^{-26}$ W m$^{-2}$ 
Hz$^{-1}$) roughly like $S^{-1.8}$, and definitely more rapidly than $S^{-3/2}$. We are forced 
to the conclusion that evolution is important. According to Eq. (14.7.29), the 
decrease of $S^{3/2} N(>S; \nu)$ with $S$ requires that 

$$\beta_0 < -2\alpha_0 - 5 \approx -6.5 \tag{14.7.36}$$

so that the number density of sources must be decreasing faster than $R(t)^{-6.5}$. 

A similar conclusion is reached from the study of number counts of radio 
sources as a function of their angular diameters. From a study of the distribution 
of the sizes of the radio sources in the 3C catalogue, Longair and Pooley 64 calculated 
the distribution in angular diameters to be expected for the fainter sources 
in the 5C catalogue if no evolutionary effects are important. Their result does not 
agree with observation for any $q_0$, indicating an evolutionary decrease of the 
proper source density. 

If evolution is as substantial as indicated by these source counts, we evidently 
cannot use the source counts to learn much about $R(t)$. We shall return to the 
program of using source counts to learn about source evolution in the next chapter, 
when we have a dynamical model for $R(t)$. 



Table 14.1 The Major Radio Source Surveys (a) 

Observatory                   Survey      ν (MHz)          Sources   S_min (10^-26 W m^-2 Hz^-1) 
Cambridge                     3C          159              471       8 
                              3CR         178              -         9 
                              4C          178              4843      2 
                              5C          408              276       0.025 
                              WKB         38               1069      14 
                              RN          178              87        0.25 
                              NB          81.5             558       1 
Mills Cross                   MSH         86               2270      7 
Parkes                        PKS         408,1410,2650    297       4 
                              PKS         408,1410         247       0.5 
                              PKS         408,1410         564       0.3 
                              PKS         408,1410         628       0.4 
                              PKS         635,1410,2650    397       1.5 
Owens Valley                  CTA         906              106       - 
                              CTB, CTBR   960              110       - 
                              CTD         1421             -         1.15 
National Radio Observatory    NRAO        750,1400         726       (3C and 3CR) 
                              NRAO        750,1400         458       0.5 
Bologna                       B1          408              629       1 
                              B2          408              3235      0.2 
Ohio State                    O           1415             128       2, 0.5 
                              O           1415             236       0.37 
                              O           1415             1199      0.3 
                              O           1415             2101      0.2 
Vermillion River              VRO         610.5            239       0.8 
                              VRO         610.5            625       0.8 
Dominion Radio Observatory    DA          1420             615       2 
Dwingeloo-NRAO                DW          1417             188       2.3 
Arecibo                       AO          430              25        - 

(a) The different surveys cover different, partially overlapping regions of the sky, and not all are 
complete within their region and flux range. For further details and references, see A. G. Pacholczyk, 
Radio Astrophysics (W. H. Freeman and Co., San Francisco, 1970), pp. 241 ff. 


8 The Steady State Cosmology 

Our work so far has been based on the “Cosmological Principle” that the 
universe is spatially isotropic and homogeneous. Hermann Bondi and Thomas 
Gold 65 have gone one step further, and have suggested that the universe obeys a 
“perfect Cosmological Principle,” that it looks the same not only at all points and 
in all directions, but at all times. This assumption leads to a steady state model of 
the universe, which was suggested at about the same time by Fred Hoyle, 66 on 
the basis of an alteration in the structure of the energy-momentum tensor appearing 
in Einstein’s field equations. We shall follow the Bondi-Gold approach here, 
as more suited to the spirit of the present chapter, and will come back to Hoyle’s 
theory in the last chapter. 

The work of Section 14.6 shows that the Hubble “constant” $\dot R(t_0)/R(t_0)$ is an 
observable parameter, so that it must be independent of the present time $t_0$ in a 
steady state model. Letting $H$ denote the permanent value of the Hubble constant, 
we have then 

$$\frac{\dot R(t)}{R(t)} = H \quad \text{for all } t$$

and therefore 

$$R(t) = R(t_0) \exp\{H(t - t_0)\} \tag{14.8.1}$$

In this model the deacceleration parameter takes the permanent value 

$$q = -\frac{\ddot R R}{\dot R^2} = -1 \tag{14.8.2}$$

To determine $k$, we return to the general relation (14.6.22) between $R(t)$ and the 
luminosity-distance versus red-shift function $d_L(z)$, which now reads 

$$t_0 - t = \int_0^{\exp\{H(t_0 - t)\} - 1} (1 + z)^{-1} \left[1 - k R^{-2}(t_0)(1 + z)^{-2} d_L^2(z)\right]^{-1/2} \frac{d}{dz}\left[(1 + z)^{-1} d_L(z)\right] dz \tag{14.8.3}$$

Since $d_L(z)$ is observable, it must now be independent of $t_0$. Hence, in order that the 
integral depend only on $t - t_0$, not $t$ or $t_0$ separately, it is necessary that 

$$k = 0 \tag{14.8.4}$$

The metric is then 

$$d\tau^2 = dt^2 - R^2(t_0) e^{2H(t - t_0)} \{dr^2 + r^2 d\theta^2 + r^2 \sin^2\theta\, d\phi^2\} \tag{14.8.5}$$

This derivation might be challenged on the grounds that the metric (14.8.5) was 
obtained as a special case of the Robertson-Walker metric, which was derived in 
Sections 14.1 and 14.2 on the basis of a definition of cosmic time that makes no 
sense in an unevolving universe. We could avoid this difficulty by viewing (14.8.5) 
as the limiting metric for a universe with an extremely slow rate of evolution. A 
more satisfactory approach is to derive (14.8.5) directly from the assumption that 
the whole four-dimensional space-time is maximally symmetric. This assumption 
is shown in Section 13.3 to lead to the metric (13.3.41), which is identical to 
(14.8.5), except that a factor $R(t_0) \exp(-H t_0)$ must be absorbed into the radial 
coordinate $r$. Comparing (13.3.41) and (14.8.5), we note that the curvature constant 
of the four-dimensional space-time of the steady state model is 

$$K = H^2 \tag{14.8.6}$$

The space-time is curved, even though the space is flat. 

The most remarkable feature of the steady state cosmology is not its space-time 
metric, but rather, the necessity of continuous creation of matter. According 
to Eq. (14.2.21), the proper distance between any two comoving galaxies increases 
as $R(t)$, so if the average number of galaxies per unit proper volume is to remain 
constant, new galaxies must appear to fill up the holes in the widening comoving 
coordinate mesh. To put this formally, we recall that in the comoving coordinate 
system $r$, $\theta$, $\phi$, $t$, the current vector of the galaxies and the total energy-momentum 
tensor are given by (14.2.11)-(14.2.14) as 

$$J_G^{\mu} = n_G U^{\mu}$$

$$T^{\mu\nu} = (\rho + p) U^{\mu} U^{\nu} + p g^{\mu\nu}$$

with 

$$U^t = 1 \qquad U^r = U^{\theta} = U^{\phi} = 0$$

In accordance with the spirit of the steady state model, we now take $n_G$, $\rho$, and $p$ 
to be constant in time as well as space. Then $J_G^{\mu}$ and $T^{\mu\nu}$ are not conserved, but 
rather 

$$J_G^{\mu}{}_{;\mu} = R^{-3}(t) \frac{\partial}{\partial t}\left(R^3(t) J_G^t\right) = 3 n_G H \tag{14.8.7}$$

$$T^{t\mu}{}_{;\mu} = R^{-3}(t) \frac{\partial}{\partial t}\left(R^3(t)[\rho + p]\right) = 3(\rho + p) H \tag{14.8.8}$$

That is, a comoving observer using a locally inertial coordinate system will see 
galaxies created at a rate $3H$ per existing galaxy, and will see energy created at a 
rate $3H$ per existing mass plus enthalpy. The present density of the universe is 
roughly of the order $10^{-6}$ nucleons/cm$^3$, so with $H^{-1} = 10^{10}$ years this would 
require an average creation of order $10^{-16}$ nucleons/cm$^3$/year. The steady state 
model is silent as to whether this new matter is created as hydrogen, or protons 
plus electrons, or neutrons, and it does not tell us whether this new matter appears 
near the old matter or in the depths of intergalactic space. However, violent events 
do seem to be occurring in the nuclei of many galaxies, so galactic nuclei seem like 
natural candidates for the location of continuous creation. 
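
The order-of-magnitude creation rate follows directly from Eq. (14.8.7). A minimal sketch, using only the two numbers quoted above (the variable names are illustrative):

```python
nucleon_density = 1.0e-6     # nucleons per cm^3, rough present density from the text
hubble_time_years = 1.0e10   # H^-1 in years, as assumed in the text

# Eq. (14.8.7): creation rate per unit volume = 3 H n
creation_rate = 3.0 * nucleon_density / hubble_time_years   # nucleons / cm^3 / year

# The same rate expressed per cubic kilometre (1 km^3 = 1e15 cm^3)
creation_rate_per_km3 = creation_rate * 1.0e15

print(f"{creation_rate:.1e} nucleons/cm^3/year")
print(f"{creation_rate_per_km3:.1f} nucleons/km^3/year")
```

This gives a few times $10^{-16}$ nucleons/cm$^3$/year, which is well under one nucleon per cubic kilometer per year.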

The steady state cosmology makes very definite predictions as to the correlation 
of luminosity distance with red shift. According to (14.3.1), if light leaves a 
comoving source at time $t_1$ and arrives at the origin at time $t_0$, the source must be 
at a coordinate $r_1$ given for $k = 0$ by 

$$r_1 = r(t_1) = \int_{t_1}^{t_0} \frac{dt}{R(t)} = H^{-1} R^{-1}(t_0)\{\exp[H(t_0 - t_1)] - 1\} \tag{14.8.9}$$

Also, (14.3.6) gives the red shift of such a source as 

$$z = \exp[H(t_0 - t_1)] - 1 \tag{14.8.10}$$

so the luminosity distance of the source is given by (14.4.14) as 

$$d_L(z) = H^{-1} z (1 + z) \tag{14.8.11}$$

As a check, note that (14.8.9)-(14.8.11) agree with (14.6.6), (14.6.4), and (14.6.8) 
for the appropriate deacceleration parameter $q = -1$. This value does not seem 
to agree with the $q_0$ determined from the observed $d_L$ versus $z$ relation, as discussed 
in Section 14.6. 

The “angular diameter” distance $d_A$ is given in this model by (14.4.22) and 
(14.8.11) as 

$$d_A(z) = \frac{H^{-1} z}{1 + z} \tag{14.8.12}$$

Note that $d_A(z)$ approaches the finite constant $H^{-1}$ as $z \rightarrow \infty$. Hence objects with 
large red shifts look very faint, but their angular diameters do not shrink below a 
minimum value. If $H^{-1}$ is $3 \times 10^9$ pc, then a galaxy of diameter $10^4$ pc will never 
appear smaller than about 0.6". 
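
The limiting angular size can be checked from Eq. (14.8.12). The sketch below uses the values in the text ($H^{-1} = 3 \times 10^9$ pc, diameter $10^4$ pc); the function names are illustrative.

```python
ARCSEC_PER_RADIAN = 206265.0

def angular_diameter_distance(z, hubble_length_pc):
    """Eq. (14.8.12): d_A = H^-1 z / (1 + z) in the steady state model."""
    return hubble_length_pc * z / (1.0 + z)

def angular_size_arcsec(diameter_pc, z, hubble_length_pc):
    """Small-angle apparent size of an object of given proper diameter."""
    return diameter_pc / angular_diameter_distance(z, hubble_length_pc) * ARCSEC_PER_RADIAN

hubble_length = 3.0e9   # H^-1 in parsecs
# Limiting size as z -> infinity, where d_A -> H^-1
min_size = 1.0e4 / hubble_length * ARCSEC_PER_RADIAN
size_at_z2 = angular_size_arcsec(1.0e4, 2.0, hubble_length)

print(f"minimum angular size = {min_size:.2f} arcsec")
print(f"angular size at z = 2: {size_at_z2:.2f} arcsec")
```

The limiting value is a substantial fraction of a second of arc, in line with the figure quoted above, and the size at any finite red shift is larger.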

If we count the number of sources with red shift less than $z$, we must look 
back to a time $t_z$ given by (14.7.8) as 

$$t_z = t_0 - H^{-1} \ln(1 + z) \tag{14.8.13}$$

To count the number of sources with apparent luminosity greater than $l$, we must 
look back to a time $t_l$ given by (14.7.9), which together with Eq. (14.8.9) now reads 

$$\exp[H(t_0 - t_l)]\{\exp[H(t_0 - t_l)] - 1\} = \left(\frac{L H^2}{4\pi l}\right)^{1/2}$$

The solution is 

$$t_l(L) = t_0 - H^{-1} \ln\left\{\frac{1}{2} + \frac{1}{2}\left[1 + 4\left(\frac{L H^2}{4\pi l}\right)^{1/2}\right]^{1/2}\right\} \tag{14.8.14}$$

The $t_1$-integral in Eq. (14.7.7) can be done explicitly, and we find the number of 
sources with red shift less than $z$ and apparent luminosity greater than $l$ as 

$$N(<z, >l) = \int_0^\infty n(L) \min\{V(t_z), V(t_l(L))\}\, dL \tag{14.8.15}$$

where $V$ is the volume 

$$V(t) \equiv \int_t^{t_0} 4\pi r^2(t_1) R^2(t_1)\, dt_1 = 4\pi H^{-3}\left\{H(t_0 - t) - \tfrac{3}{2} + 2\exp[-H(t_0 - t)] - \tfrac{1}{2}\exp[-2H(t_0 - t)]\right\} \tag{14.8.16}$$

and $n(L)\, dL$ is the time-independent proper number density of sources with 
absolute luminosity between $L$ and $L + dL$. 

As a special case of Eq. (14.8.15), we note that the number of sources with red
shift less than $z$ is

$$ N(<z) = 4\pi H^{-3} n \left[\ln(1 + z) - \frac{z(1 + 3z/2)}{(1 + z)^2}\right] \tag{14.8.17} $$

where $n$ is the total number density of sources

$$ n \equiv \int_0^\infty n(L) \, dL $$


This result is independent of any assumptions concerning the luminosity distribu- 
tion of sources, and of course no evolution of the source density or luminosity 
distribution is possible in a steady state universe. However, with the limited 
statistics now available, it appears that (14.8.17) does not agree with the observed 
red-shift distribution of the quasi-stellar radio sources. 67 In particular, the 
observed red-shift distribution of the quasi-stellar sources shows a pronounced 
peak 57 near $z = 1.95$, which is absent in Eq. (14.8.17). It should be noted though
that the observed value of $N(<z)$ should in general be smaller than the theoretical
prediction (14.8.17), because some sources are not counted if their optical or radio
strength is too small.
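To get a feeling for Eq. (14.8.17), the bracket can be tabulated and compared with the first two terms of its power series in $z$. This is a minimal sketch: the function names are illustrative, and the series coefficients $z^3/3 - \tfrac{3}{4} z^4$ are obtained here by expanding the bracket directly rather than quoted from the text.

```python
import math

def counts_shape(z):
    """Bracket of Eq. (14.8.17), i.e. N(<z) divided by 4 pi H^-3 n."""
    return math.log1p(z) - z * (1.0 + 1.5 * z) / (1.0 + z) ** 2

def counts_shape_small_z(z):
    """First two terms of the power series of counts_shape(z) about z = 0."""
    return z ** 3 / 3.0 - 0.75 * z ** 4

for z in (0.01, 0.05, 0.5, 2.0, 5.0):
    exact = counts_shape(z)
    series = counts_shape_small_z(z)
    print(f"z = {z:5.2f}   exact = {exact:.6e}   series = {series:.6e}")
```

For small red shifts the two columns agree closely, while at $z$ of order unity the series is useless and the logarithmic growth of the exact expression takes over.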

As another special case of Eq. (14.8.15), we note that the number of sources
with apparent luminosity greater than $l$ is

$$ N(>l) = \int_0^\infty n(L) V(t_l(L)) \, dL \tag{14.8.18} $$

In contrast with $N(<z)$, the result here depends on the details of the distribution
function $n(L)$.

A quantity of greater observational interest is the number of radio sources
with strength at frequency $\nu$ greater than $S$. If all sources have the same
“straight” spectrum $P \propto \nu^{-\alpha}$, then (14.7.15) gives the number of
such sources as

$$ N(>S; \nu) = \int_0^\infty n(P, \nu) V(t_{S\alpha}(P)) \, dP \tag{14.8.19} $$

where $n(P, \nu)\,dP$ is the number of sources with intrinsic power at frequency $\nu$
between $P$ and $P + dP$; $V(t)$ is given by (14.8.16); and $t_{S\alpha}(P)$ is
determined by (14.7.16), which now reads

$$ \exp[\tfrac{1}{2}(3 + \alpha)H(t_0 - t_{S\alpha})] - \exp[\tfrac{1}{2}(1 + \alpha)H(t_0 - t_{S\alpha})] = \left(\frac{P H^2}{S}\right)^{1/2} \tag{14.8.20} $$


This equation cannot be solved analytically for the observed spectral index
$\alpha \simeq 0.7$. However, it follows from (14.8.20), (14.8.19), and (14.8.16) that
$N(>S, \nu)$ decreases more slowly than $S^{-3/2}$ for all source strengths, in
contradiction with observed number counts, which decrease more rapidly 63 than
$S^{-3/2}$ for $S$ greater than about $4 \times 10^{-26}$ W m$^{-2}$ Hz$^{-1}$, and only
then begin to decrease more slowly than $S^{-3/2}$. In the last section we saw that
these observations are also inconsistent with the results of nonsteady state
cosmologies, but in that case the discrepancy could be removed by assuming an
evolution of source densities, while in the steady state cosmology no evolution of
the source density is allowed.
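Although Eq. (14.8.20) has no closed-form solution for $\alpha \simeq 0.7$, it is easy to solve numerically. The sketch below is one possible approach, assuming a population of sources that all have the same power $P$; it writes $x = \exp[H(t_0 - t_{S\alpha})]$ and $a = (P H^2/S)^{1/2}$, finds $x$ by bisection, and evaluates the logarithmic slope of the counts from Eq. (14.8.16). The function names and the choice of bisection are illustrative assumptions, not part of the text.

```python
import math

def solve_x(a, alpha):
    """Solve x**((3+alpha)/2) - x**((1+alpha)/2) = a for x >= 1 by bisection."""
    def f(x):
        return x ** (0.5 * (3.0 + alpha)) - x ** (0.5 * (1.0 + alpha)) - a
    lo, hi = 1.0, 2.0
    while f(hi) < 0.0:          # expand the bracket until it contains the root
        hi *= 2.0
    for _ in range(200):        # the left side is increasing for x >= 1
        mid = 0.5 * (lo + hi)
        if f(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def volume(x):
    """V / (4 pi H^-3) from Eq. (14.8.16), with x = exp[H(t0 - t)]."""
    return math.log(x) - 1.5 + 2.0 / x - 0.5 / x ** 2

def log_slope(a, alpha, eps=1.0e-4):
    """d ln N / d ln S for a single source power, where a = (P H^2 / S)^(1/2)."""
    # a scales as S^(-1/2), so d ln N / d ln S = -(1/2) d ln V / d ln a.
    up = volume(solve_x(a * (1.0 + eps), alpha))
    down = volume(solve_x(a * (1.0 - eps), alpha))
    d_ln_a = math.log1p(eps) - math.log1p(-eps)
    return -0.5 * (math.log(up) - math.log(down)) / d_ln_a

for a in (0.01, 0.1, 1.0, 10.0, 100.0):
    print(f"a = {a:7.2f}   d ln N / d ln S = {log_slope(a, 0.7):.3f}")
```

For every value of $a$ the slope lies between $-3/2$ and $0$, approaching $-3/2$ only for the brightest (nearest) sources, which illustrates the statement that the steady state counts always fall more slowly than $S^{-3/2}$.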

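As a check, we note that the quantities $\beta_0(L)$ and $\beta_0(P, \nu)$ defined by
Eqs. (14.7.22) and (14.7.23) must vanish in the steady state model

$$ \beta_0(L) = \beta_0(P, \nu) = 0 $$

Hence (14.7.27), (14.7.28), and (14.7.29), with $q_0 = -1$ and straight spectra, now
give the number counts for “nearby” sources as

$$ N(<z) \simeq \frac{4\pi n z^3}{3 H^3}\left[1 - \frac{9}{4} z + \cdots\right] \tag{14.8.21} $$

$$ N(>l) \simeq \frac{4\pi}{3 (4\pi l)^{3/2}} \int_0^\infty n(L) \left[1 - \frac{21}{4} H \left(\frac{L}{4\pi l}\right)^{1/2} + \cdots\right] L^{3/2} \, dL \tag{14.8.22} $$

$$ N(>S, \nu) \simeq \frac{4\pi}{3 S^{3/2}} \int_0^\infty n(P, \nu) \left[1 - \frac{3}{4}(5 + 2\alpha) H \left(\frac{P}{S}\right)^{1/2} + \cdots\right] P^{3/2} \, dP \tag{14.8.23} $$

in agreement with the power-series expansions of the general formulas (14.8.17),
(14.8.18), and (14.8.19).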

The steady state model does not appear to agree with the observed $d_L$ versus $z$
relation or with the source counts for $N(<z)$ and $N(>S, \nu)$. In a sense, this
disagreement is a credit to the model; alone among all cosmologies, the steady state
model makes such definite predictions that it can be disproved even with the limited
observational evidence at our disposal. The steady state model is so attractive that
many of its adherents still retain hope that the evidence against it will eventually
disappear as observations improve. However, if the cosmic microwave radiation
discussed in the next chapter is really black-body radiation, it will be difficult to
doubt that the universe has evolved from a hotter denser early stage.

“Now entertain conjecture
of a time, when creeping 
murmur and the poring 
dark fills the wide vessel of 
the universe.” 

William Shakespeare, King Henry 
the Fifth 


15 COSMOLOGY: THE 
STANDARD MODEL 


In the last chapter we laid out the coordinates for a map of the universe in 
space and time. Now we must begin to fill in this map with the islands of matter 
and the seas of radiation that make up the physical contents of the universe. 

For the most part, we shall continue to base our discussion on the assumption
of isotropy and homogeneity, now supplemented with Einstein’s field equations.
The future of the universe then depends critically on its curvature: If the universe
is open, it will go on expanding forever, whereas if it is closed, its present expansion
will eventually cease and be succeeded by a general contraction. The curvature in
turn depends critically on the present energy density $\rho_0$; the universe is open or
closed according to whether $\rho_0$ is less or greater than a critical value $\rho_c$,
of order $10^{-29}$ g/cm$^3$. It appears that $\rho_0$ mostly arises from the rest-mass of
ordinary matter (neutrons and protons). In this case, the universe is open and $\rho_0$
is less than $\rho_c$ if the deacceleration parameter $q_0$ is less than $\tfrac{1}{2}$,
whereas the universe is closed and $\rho_0$ is greater than $\rho_c$ if $q_0$ is greater
than $\tfrac{1}{2}$; this justifies the emphasis on the measurement of $q_0$ throughout
the last chapter. However, the observation that $q_0 \simeq 1$ conflicts with the mass
density observed in galaxies, which is considerably less than $\rho_c$. This discrepancy
has led to an intensive search for signs of an intergalactic gas, a search that has so
far been quite unsuccessful.

Looking back in time, we find that any isotropic homogeneous universe
governed by Einstein’s equations must have started with a singularity of infinite
density. Dating from this singularity, the age of the universe must be less than
$H_0^{-1}$, and less than $\tfrac{2}{3} H_0^{-1}$ if $q_0 > \tfrac{1}{2}$. Radioactive
dating and the theory of stellar evolution give uncertain ages, ranging from
$7 \times 10^9$ to $16 \times 10^9$ years, but it would be difficult to admit an age much
less than $\tfrac{2}{3} H_0^{-1}$.

The most prominent relic of the hot early universe is the 2.7°K microwave
radiation background, predicted in 1950 and observed in 1965. The weight of the
data is so far consistent with the expectation that this radiation has a Planck
black-body spectrum and is perfectly isotropic. Knowing the present radiation
temperature, we can trace the thermal history of the universe back to the first few
minutes, and calculate the production of complex nuclei in the primordial fireball.
A fairly definite prediction emerges, that about 27% of the nucleons in the early
universe should have been fused into He$^4$. This is in agreement with some
measurements of the present cosmic helium abundance, but in disagreement with others.
Another relic of the early universe is our present cosmic morphology: Stars form 
galaxies, galaxies form clusters, and clusters form a more or less homogeneous gas. 
Our present theoretical understanding of how this structure evolved is not in good 
shape, but it is clear that the radiation background played an important role. We 
can also speculate on the first few seconds of cosmic history, when the temperature 
was high enough to produce mesons, baryons, and antibaryons in large numbers ; 
so far there does not seem to be any way to check the results of these speculations. 

The preceding summary describes what may be called the “standard model” 
of the universe, based on the Cosmological Principle and Einstein’s field equations. 
One other “standard” assumption, which plays an important role in Sections 
15.7-15.11, is that the distant galaxies are, like our own, composed of baryons 
rather than antibaryons. It has often been suggested that since baryon number is,
like charge, exactly conserved, the universe ought to contain equal numbers of
baryons and antibaryons as well as positive and negative charges. However, it
should be kept in mind that baryon number is really unlike charge: There are
long-range forces associated with charge but, as far as we know, not with baryon
number. Indeed, in a finite universe the total charge must be zero, as can be
immediately seen by integrating the Maxwell equation $\nabla \cdot \mathbf{E} = \varepsilon$
over the volume of the universe; no such conclusion can be derived for baryon number.
In any case,
even if the net baryon number of the universe is zero, baryons and antibaryons 
must somehow have become separated at some time in the past, and most of the 
considerations of this chapter are applicable to the evolution of the universe after 
that time. 

Of course, the standard model may be partly or wholly wrong. However, its 
importance lies not in its certain truth, but in the common meeting ground that it 
provides for an enormous variety of observational data. By discussing these data 
in the context of a standard cosmological model, we can begin to appreciate their 
cosmological relevance, whatever model ultimately proves correct. Some other 
possible models are discussed in the next chapter. 


1 Einstein’s Equations 

Let us begin our discussion of dynamical cosmology by considering the
constraints imposed by Einstein’s field equations on the metric for a general
isotropic and homogeneous universe. According to the results of Section 14.2, this
metric can be chosen to have the Robertson-Walker form:

$$ g_{tt} = -1 \qquad g_{it} = 0 \qquad g_{ij} = R^2(t)\,\tilde{g}_{ij}(x) \tag{15.1.1} $$

Here $t$ is a cosmic time coordinate; $i$ and $j$ run over three comoving spatial
coordinates $r$, $\theta$, and $\varphi$; and $\tilde{g}_{ij}$ is the metric for a
three-dimensional maximally symmetric space:

$$ \tilde{g}_{rr} = (1 - kr^2)^{-1} \qquad \tilde{g}_{\theta\theta} = r^2 \qquad \tilde{g}_{\varphi\varphi} = r^2 \sin^2\theta \qquad \tilde{g}_{ij} = 0 \ \ \text{for}\ i \neq j \tag{15.1.2} $$

with $k$ equal to +1, -1, or 0.


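The only nonvanishing elements of the affine connection for this metric are

$$ \Gamma^t_{ij} = R\dot{R}\,\tilde{g}_{ij} \tag{15.1.3} $$

$$ \Gamma^i_{tj} = \frac{\dot{R}}{R}\,\delta^i_j \tag{15.1.4} $$

$$ \Gamma^i_{jk} = \tfrac{1}{2}(\tilde{g}^{-1})^{il}\left(\frac{\partial \tilde{g}_{lj}}{\partial x^k} + \frac{\partial \tilde{g}_{lk}}{\partial x^j} - \frac{\partial \tilde{g}_{jk}}{\partial x^l}\right) \equiv \tilde{\Gamma}^i_{jk} \tag{15.1.5} $$

Its Ricci tensor then has the elements

$$ R_{tt} = \frac{3\ddot{R}}{R} \tag{15.1.6} $$

$$ R_{ti} = 0 \tag{15.1.7} $$

$$ R_{ij} = \tilde{R}_{ij} - (R\ddot{R} + 2\dot{R}^2)\,\tilde{g}_{ij} \tag{15.1.8} $$

where $\tilde{R}_{ij}$ is the spatial Ricci tensor calculated from the metric $\tilde{g}_{ij}$:

$$ \tilde{R}_{ij} = \frac{\partial \tilde{\Gamma}^k_{ki}}{\partial x^j} - \frac{\partial \tilde{\Gamma}^k_{ij}}{\partial x^k} + \tilde{\Gamma}^k_{il}\tilde{\Gamma}^l_{jk} - \tilde{\Gamma}^k_{ij}\tilde{\Gamma}^l_{kl} \tag{15.1.9} $$

Instead of calculating $\tilde{R}_{ij}$ directly, we can save ourselves a good deal of
work by recalling that $\tilde{g}_{ij}$, as the metric of a maximally symmetric space,
must necessarily have a Ricci tensor of the form (13.2.4):

$$ \tilde{R}_{ij} = -2k\,\tilde{g}_{ij} \tag{15.1.10} $$

Together with (15.1.8), this gives the space-space components of the space-time
Ricci tensor as

$$ R_{ij} = -(R\ddot{R} + 2\dot{R}^2 + 2k)\,\tilde{g}_{ij} \tag{15.1.11} $$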

As shown in Section 14.2, the energy-momentum tensor here must have the
perfect-fluid form

$$ T_{\mu\nu} = p\,g_{\mu\nu} + (\rho + p)\,U_\mu U_\nu \tag{15.1.12} $$

where $\rho$ and $p$ are functions of $t$ alone, and $U^\mu$ is given by Eqs. (14.2.13)
and (14.2.14):

$$ U^t = 1 \qquad U^i = 0 \tag{15.1.13} $$

The source term in the Einstein equations is then

$$ S_{\mu\nu} \equiv T_{\mu\nu} - \tfrac{1}{2} g_{\mu\nu} T^\lambda{}_\lambda = \tfrac{1}{2}(\rho - p)\,g_{\mu\nu} + (\rho + p)\,U_\mu U_\nu \tag{15.1.14} $$

so (15.1.1), (15.1.13), and (15.1.14) give

$$ S_{tt} = \tfrac{1}{2}(\rho + 3p) \tag{15.1.15} $$

$$ S_{it} = 0 \tag{15.1.16} $$

$$ S_{ij} = \tfrac{1}{2}(\rho - p) R^2\,\tilde{g}_{ij} \tag{15.1.17} $$

The Einstein equations read

$$ R_{\mu\nu} = -8\pi G S_{\mu\nu} $$

With (15.1.6), (15.1.7), (15.1.11), and (15.1.15)-(15.1.17), the time-time component
gives

$$ 3\ddot{R} = -4\pi G(\rho + 3p) R \tag{15.1.18} $$

the space-space components give the single equation

$$ R\ddot{R} + 2\dot{R}^2 + 2k = 4\pi G(\rho - p) R^2 \tag{15.1.19} $$

and the space-time components give 0 = 0.

By eliminating $\ddot{R}$ from (15.1.18) and (15.1.19), we find a first-order
differential equation for $R(t)$:

$$ \dot{R}^2 + k = \frac{8\pi G}{3} \rho R^2 \tag{15.1.20} $$

In addition we have the equation of energy conservation (14.2.19),

$$ \dot{p} R^3 = \frac{d}{dt}\left\{R^3 [\rho + p]\right\} $$

or, equivalently,

$$ \frac{d}{dR}(\rho R^3) = -3 p R^2 \tag{15.1.21} $$

Given an equation of state $p = p(\rho)$, we can use this equation to determine $\rho$
as a function of $R$. For instance, if the energy density of the universe is dominated
by nonrelativistic matter with negligible pressure, then (15.1.21) gives

$$ \rho \propto R^{-3} \quad \text{for}\ p \ll \rho \tag{15.1.22} $$

whereas if the energy density is dominated by relativistic particles, such as
photons, then $p = \rho/3$, and (15.1.21) gives

$$ \rho \propto R^{-4} \quad \text{for}\ p = \frac{\rho}{3} \tag{15.1.23} $$
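Both scaling laws are instances of one short calculation. As an illustration, assume an equation of state of the form $p = w\rho$ with $w$ a constant; the symbol $w$ is introduced here only for this sketch and is not used in the text. Inserting it in the conservation equation (15.1.21) gives

```latex
\frac{d}{dR}\left(\rho R^3\right) = -3 w \rho R^2
\;\Longrightarrow\;
R^3 \frac{d\rho}{dR} + 3 R^2 \rho = -3 w \rho R^2
\;\Longrightarrow\;
\frac{d\rho}{\rho} = -3(1 + w)\,\frac{dR}{R}
\;\Longrightarrow\;
\rho \propto R^{-3(1 + w)}
```

Setting $w = 0$ reproduces the $R^{-3}$ law for pressureless matter, and $w = 1/3$ reproduces the $R^{-4}$ law for relativistic particles.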


Knowing $\rho$ as a function of $R$, we can determine $R(t)$ for all time by solving
Eq. (15.1.20). The fundamental equations of dynamical cosmology are thus the Einstein
equations (15.1.20), the energy-conservation equation (15.1.21), and the equation of
state. The cosmological models, based on a Robertson-Walker metric, in which
$R(t)$ is derived in this way, are known as Friedmann models. 1

[Incidentally, the solution R(t) determined in this way will automatically 
satisfy (15.1.18) and (15.1.19), for by differentiating (15.1.20) with respect to time 
and using (15.1.21), we find 


$$ 2\dot{R}\ddot{R} = \frac{8\pi G \dot{R}}{3R}\left[-\rho R^2 + \frac{d}{dR}(\rho R^3)\right] = \frac{8\pi G \dot{R}}{3R}\left(-\rho R^2 - 3 p R^2\right) $$


which is just the same as (15.1.18). Equation (15.1.19) then follows trivially from 
(15.1.18) and (15.1.20). The reason we can make do with the single field equation 
(15.1.20), instead of the two equations (15.1.18) and (15.1.19), is of course that 
these two equations are not functionally independent, being related by the 
Bianchi identities to the energy- conservation equation (15.1.21).] 

It is possible to learn a good deal about the past and future expansion of the
universe by simply inspecting Eqs. (15.1.18)-(15.1.21), even without specifying a
definite equation of state. Equation (15.1.18) shows that as long as the quantity
$\rho + 3p$ remains positive, the “acceleration” $\ddot{R}/R$ is negative. Since at
present $R > 0$ (by definition), and $\dot{R}/R > 0$ (because we see red shifts, not
blue shifts), it follows that the curve of $R(t)$ versus $t$ must be concave downward,
and must have reached $R(t) = 0$ at some finite time in the past. Let us call this
time $t = 0$, so that

$$ R(0) = 0 \tag{15.1.24} $$

The present time $t_0$ is then the time elapsed since this singularity, and may justly
be called the age of the universe. If $\ddot{R}(t)$ vanished for $0 < t < t_0$, then
$R(t)$ would be just $R(t_0)\,t/t_0$, and so the age $t_0$ would just equal the Hubble
time $H_0^{-1} = R(t_0)/\dot{R}(t_0)$. With $\ddot{R}$ negative for $0 < t < t_0$, the
age of the universe must be less than the Hubble time:

$$ t_0 < H_0^{-1} \tag{15.1.25} $$

Looking into the future, we see from Eq. (15.1.21) that as long as the pressure $p$
does not become negative, the density $\rho$ must decrease with increasing $R$ at
least as fast as $R^{-3}$, so that for $R \to \infty$, the right-hand side of
Eq. (15.1.20) vanishes at least as fast as $R^{-1}$. For $k = -1$, $\dot{R}^2(t)$
remains positive-definite, so $R(t)$ goes on increasing, with

$$ R(t) \to t \quad \text{as}\ t \to \infty \quad \text{for}\ k = -1 $$

For $k = 0$, $\dot{R}^2(t)$ remains positive-definite so $R(t)$ goes on increasing, but
more slowly than $t$. For $k = +1$, $\dot{R}^2(t)$ will reach zero when $\rho R^2$ drops
to the value $3/8\pi G$. Since $\ddot{R}$ is negative-definite, $R(t)$ will then begin
to decrease again, and eventually must again reach $R = 0$ at some finite time in the
future. Hence the qualitative course of cosmic history is determined by the sign of
the spatial curvature: If $k = -1$ or $k = 0$, then the universe will go on expanding
forever, whereas if $k = +1$, then the expansion will eventually cease and be followed
by a contraction back to a singular state with $R(t) = 0$.
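These qualitative statements can be made concrete for pressureless matter. The sketch below is illustrative only: it assumes $\rho \propto R^{-3}$ as in (15.1.22), measures time in units of $H_0^{-1}$, and labels the present density in units of $3H_0^2/8\pi G$ by a parameter called `omega` (a name introduced here, not used in the text). Evaluating (15.1.20) at the present time then fixes $k/R_0^2 H_0^2 = \omega - 1$, so that with $a = R/R_0$ the equation becomes $(da/dt)^2 = \omega/a + 1 - \omega$.

```python
import math

def age_in_hubble_times(omega, steps=20000):
    """H0 * t0 for pressureless matter, from (da/dt)^2 = omega/a + (1 - omega).

    The age is the integral of da / (da/dt) from a = 0 to a = 1. Substituting
    a = u**2 removes the singularity of the integrand at a = 0.
    """
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        u = (i + 0.5) * h                      # midpoint rule on 0 < u < 1
        total += 2.0 * u * u / math.sqrt(omega + (1.0 - omega) * u * u)
    return total * h

def turnaround_scale(omega):
    """Largest R/R0 reached (where da/dt = 0), or None if expansion never stops."""
    return omega / (omega - 1.0) if omega > 1.0 else None

for omega in (0.03, 0.5, 1.0, 2.0):
    print(f"omega = {omega:4.2f}   H0*t0 = {age_in_hubble_times(omega):.4f}"
          f"   turnaround R/R0 = {turnaround_scale(omega)}")
```

In every case the age comes out below the Hubble time, in line with (15.1.25); it equals two-thirds of the Hubble time when the curvature vanishes, and only the positively curved models have a finite turnaround radius.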

The combination of the Cosmological Principle with the Einstein field equations
illuminates some of the profound questions raised by Newton and Mach.
(See Section 1.3.) Suppose that we want to study some physical system $S$, such as
the solar system or the rotating bucket of Newton, whose size is much less than the
cosmic scale factor $R$. We can imagine $S$ to be placed in a spherical cavity, cut out
of the expanding universe, and so long as the size of this cavity is much less than
$R$, we can safely consider this cavity to be empty apart from the system $S$. If $S$
were absent, the gravitational field inside the cavity would be a spherically
symmetric field with $R_{\mu\nu} = 0$, and hence, according to the Birkhoff theorem
(see Section 11.7), it would have a flat-space metric equivalent to the Minkowski
metric $\eta_{\mu\nu}$. As long as the system $S$ is not too big, we can then calculate
its gravitational field as a perturbation on $\eta_{\mu\nu}$, ignoring all matter outside
our cavity, and we can determine the behavior of the system by using Newtonian or
special-relativistic mechanics. The question of what determines the inertial frames is
now answered, for the only reference frames in which the whole universe appears
spherically symmetric, so that Birkhoff’s theorem applies, are the frames at the
center of our cavity, which do not rotate with respect to the expanding cloud of
“typical” galaxies. The inertial frames are any reference frames that move at
constant velocity, and without rotation, relative to the frames in which the
universe appears spherically symmetric.

These remarks lead to an alternative derivation 2 of the dynamical equations 
for an expanding universe. If we mentally draw a comoving spherical surface 
anywhere in the universe, then as long as its proper radius is much less than R(t), 
the galaxies within this sphere will move under the influence of their own gravita- 
tional field, and the gravitational field of the rest of the universe may be neglected. 
We can then think of the universe as consisting of a Newtonian gas in a state of
everywhere-uniform expansion. Any given gas particle will have a trajectory

$$ \mathbf{x}(t) = \mathbf{x}(t_0)\,\frac{R(t)}{R(t_0)} $$

with $R(t)$ a scale factor common to the whole gas. [Note that the gas appears the
same to an observer mounted on any gas particle as it does to an observer at the
origin. Also, the “comoving” coordinates of a given gas particle are not $x^i(t)$, but
rather $r^i = x^i(t)/R(t)$.] The gravitational potential energy $V$ of such a particle
just arises from the matter within a sphere of radius $|\mathbf{x}(t)|$ and center at
the origin, so

$$ V(t) = -\frac{4\pi}{3}\,mG\,\rho(t)\,|\mathbf{x}(t)|^2 = -\frac{4\pi}{3}\,mG\,\rho(t)\,|\mathbf{x}(t_0)|^2\,\frac{R^2(t)}{R^2(t_0)} $$

where $m$ is the particle mass and $\rho(t)$ is the uniform mass density of the gas. The
kinetic energy of this particle is

$$ T(t) = \tfrac{1}{2} m |\dot{\mathbf{x}}(t)|^2 = \tfrac{1}{2} m |\mathbf{x}(t_0)|^2\,\frac{\dot{R}^2(t)}{R^2(t_0)} $$

and its total energy is

$$ E = T(t) + V(t) = \frac{m |\mathbf{x}(t_0)|^2}{2 R^2(t_0)}\left[\dot{R}^2(t) - \frac{8\pi G}{3}\rho(t) R^2(t)\right] $$

With $E$ constant, this is just the same as Eq. (15.1.20), provided that we identify
the energy of a particle as

$$ E = -\frac{k\,m |\mathbf{x}(t_0)|^2}{2 R^2(t_0)} \tag{15.1.26} $$

For $k = -1$, $E$ is positive-definite, so gravitation cannot prevent the gas from
dispersing to infinity, with a finite asymptotic velocity. For $k = 0$, $E$ vanishes,
and the gas is just barely able to expand indefinitely. For $k = +1$, $E$ is negative,
and the explosion must ultimately cease and be followed by an implosion.

Although Newtonian cosmology can reproduce the chief results derived from
Einstein’s equations, it is essentially incomplete, for several reasons. We need
general relativity to justify the neglect of all the matter outside a sphere of radius
$|\mathbf{x}(t)|$ in calculating the gravitational potential at $\mathbf{x}(t)$. We
cannot use Newtonian mechanics when the medium itself consists of particles with
relativistic local velocities. Finally, it is only through the use of general
relativity that we are able to interpret the observation of light signals correctly in
terms of the cosmic scale factor $R(t)$.


2 Density and Pressure of the Present Universe 

At the present instant, the pressure and energy density of the universe are
given by Eqs. (15.1.18) and (15.1.19) as

$$ \rho_0 = \frac{3}{8\pi G}\left(\frac{k}{R_0^2} + H_0^2\right) \tag{15.2.1} $$

$$ p_0 = -\frac{1}{8\pi G}\left[\frac{k}{R_0^2} + H_0^2 (1 - 2q_0)\right] \tag{15.2.2} $$

Here $R_0$ is the present value of the cosmic scale factor $R(t)$, and $H_0$ and $q_0$
are the Hubble constant and the deacceleration parameter, defined in Section 14.3 as
the present values of $\dot{R}/R$ and $-R\ddot{R}/\dot{R}^2$. From (15.2.1) it follows
that the spatial curvature $k/R_0^2$ is positive or negative, according to whether
$\rho_0$ is greater or less than a critical density

$$ \rho_c \equiv \frac{3 H_0^2}{8\pi G} = 1.1 \times 10^{-29} \left(\frac{H_0}{75\ \text{km/sec/Mpc}}\right)^2 \text{g/cm}^3 \tag{15.2.3} $$


As we shall see, there are good grounds to believe that the energy density of
the present universe is dominated by nonrelativistic matter, with

$$ p_0 \ll \rho_0 \tag{15.2.4} $$

If this is the case, then (15.2.2) yields a formula for the spatial curvature in terms
of the observable parameters $H_0$ and $q_0$:

$$ \frac{k}{R_0^2} = (2q_0 - 1) H_0^2 \tag{15.2.5} $$

and (15.2.1) gives the ratio of the present density to the critical density (15.2.3) as

$$ \frac{\rho_0}{\rho_c} = 2 q_0 \tag{15.2.6} $$

For $q_0 > \tfrac{1}{2}$ the universe is positively curved, with $\rho_0 > \rho_c$,
whereas for $q_0 < \tfrac{1}{2}$ the universe is negatively curved, with
$\rho_0 < \rho_c$. If we give credence to the values $q_0 \simeq 1$ and
$H_0 \simeq 75$ km/sec/Mpc deduced from the red shift versus luminosity relation (see
Section 14.6), then we must conclude that the density of the universe is about
$2\rho_c$, or about $2 \times 10^{-29}$ g/cm$^3$.
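The numbers in this paragraph are easy to reproduce. The following is a minimal sketch, assuming present-day rounded values for Newton's constant and the megaparsec (the constants and function names are assumptions made for illustration, so the last digit need not match the value quoted in the text).

```python
import math

G_CGS = 6.674e-8           # Newton's constant in cm^3 g^-1 s^-2 (assumed value)
CM_PER_MPC = 3.0857e24     # centimetres per megaparsec (assumed value)

def critical_density(h0_km_s_mpc):
    """rho_c = 3 H0^2 / (8 pi G) in g/cm^3, as in Eq. (15.2.3)."""
    h0 = h0_km_s_mpc * 1.0e5 / CM_PER_MPC       # Hubble constant in s^-1
    return 3.0 * h0 ** 2 / (8.0 * math.pi * G_CGS)

def dust_density_and_curvature(q0, h0_km_s_mpc):
    """Return (rho_0 in g/cm^3, k / (R0 H0)^2) for negligible pressure.

    Uses rho_0 = 2 q0 rho_c and k / R0^2 = (2 q0 - 1) H0^2.
    """
    return 2.0 * q0 * critical_density(h0_km_s_mpc), 2.0 * q0 - 1.0

rho_c = critical_density(75.0)
rho_0, curvature = dust_density_and_curvature(1.0, 75.0)
print(f"rho_c = {rho_c:.2e} g/cm^3")
print(f"q0 = 1: rho_0 = {rho_0:.2e} g/cm^3, k/(R0 H0)^2 = {curvature:+.1f}")
```

With $q_0 = 1$ the curvature term is positive and the density is twice critical, of order $2 \times 10^{-29}$ g/cm$^3$ as stated above.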

Unfortunately, this result does not agree with the observed density of galactic
mass. 3 The masses of spiral galaxies within about 15 Mpc can be determined by a
dynamical analysis of their rotational velocities as functions of distance from the
galactic centers. The masses of a half-dozen or so elliptical galaxies can be
calculated from the virial theorem, 4 which gives a mass

$$ M = \frac{2\langle v^2 \rangle}{G \langle r^{-1} \rangle} \tag{15.2.7} $$

where $\langle v^2 \rangle$ is the mean-square velocity relative to the center of mass,
and $\langle r^{-1} \rangle$ is the mean reciprocal separation between stars. The total
masses of pairs of galaxies can be determined statistically from their relative
velocities and separations, under the assumption that the pairs are oriented at random
with respect to the line of sight.

In all three of the above methods, the galactic mass is given by a formula
of form

$$ M = \frac{\mu V^2 D}{G} \tag{15.2.8} $$

where $V$ is some characteristic internal velocity, $D$ is some characteristic dimension
of the object under study, and $\mu$ is a dimensionless number of order unity, which
depends on the details of the method used and the object studied. The characteristic
distance $D$ is measured from the corresponding angular dimension $\delta$ and the
cosmological red shift $z$, using Eqs. (14.4.15) and (14.6.7), which for $z \ll 1$ give

$$ D = \frac{z\delta}{H_0} \tag{15.2.9} $$

(For nearby galaxies, the “angular diameter distance” $D/\delta$ might be determined
from the apparent magnitude of brightest stars, brightest globular clusters, and
so on, rather than from the red shift. However, if such distance determinations
form part of the cosmic distance ladder used to measure the Hubble constant, then
any error in these distances would also show up in the Hubble constant, so that $D$
would still scale like $1/H_0$.) The internal velocities $V$ are measured directly from
the distribution in red shift around the average value $z$ for the galaxy. It is
convenient to describe the masses determined in this way in terms of a
mass-to-luminosity ratio $M/L$, the absolute luminosity $L$ being given in terms of the
apparent luminosity $l$ by Eqs. (14.4.12) and (14.6.7), which for a small red shift $z$
yield

$$ L = 4\pi l z^2 H_0^{-2} \tag{15.2.10} $$

From (15.2.8)-(15.2.10), it follows that the ratio $M/L$ determined by the three
methods described above is proportional to the value assumed for the Hubble
constant $H_0$.
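The scaling with the Hubble constant can be spelled out in one line. Writing $\delta$ for the measured angular dimension and holding the directly observed quantities $z$, $\delta$, $V$, and $l$ fixed, the mass formula, the distance formula, and the luminosity formula above give

```latex
D = \frac{z\,\delta}{H_0} \propto H_0^{-1}, \qquad
M = \frac{\mu V^2 D}{G} \propto H_0^{-1}, \qquad
L = 4\pi l z^2 H_0^{-2} \propto H_0^{-2}
\quad\Longrightarrow\quad
\frac{M}{L} \propto \frac{H_0^{-1}}{H_0^{-2}} = H_0
```

so a larger assumed Hubble constant shrinks the inferred luminosity faster than the inferred mass, and the mass-to-light ratio rises in proportion.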

With $H_0$ taken as 75 km/sec/Mpc, it appears 3 that the galactic mass-to-light
ratio $M/L$ for elliptical galaxies is about 50 times the solar ratio
$M_\odot/L_\odot$, whereas for spiral galaxies estimates of $M/L$ range from 1 to 20
times $M_\odot/L_\odot$. According to a survey of these $M/L$ values by Oort, 5 the
overall mass-to-light ratio for all galaxies is about $21\,M_\odot/L_\odot$. Since the
Hubble constant may well be different from 75 km/sec/Mpc, this result should be written

$$ \frac{M}{L} \simeq 21\,\frac{M_\odot}{L_\odot}\left(\frac{H_0}{75\ \text{km/sec/Mpc}}\right) \tag{15.2.11} $$

(For instance, van den Bergh 6 carried out an analysis of galactic masses similar to
Oort’s, but assumed that $H_0 = 120$ km/sec/Mpc, and therefore obtained for $M/L$
the result $30\,M_\odot/L_\odot$.) Oort also used number counts of galaxies to estimate
the luminosity density of the universe to be $2.2 \times 10^{-10}\,L_\odot/\text{pc}^3$;
this value would scale with $H_0$ like $L/D^3$, which according to (15.2.9) and
(15.2.10) scales like $H_0$, so for a general Hubble’s constant Oort’s estimate of the
luminosity density would be

$$ \mathcal{L} \simeq 2.2 \times 10^{-10}\,L_\odot/\text{pc}^3 \left(\frac{H_0}{75\ \text{km/sec/Mpc}}\right) \tag{15.2.12} $$

The galactic mass density of the universe can now be obtained as

$$ \rho_G = \mathcal{L}\left(\frac{M}{L}\right) = 4.6 \times 10^{-9}\,M_\odot/\text{pc}^3 \left(\frac{H_0}{75\ \text{km/sec/Mpc}}\right)^2 = 3.1 \times 10^{-31}\ \text{g/cm}^3 \left(\frac{H_0}{75\ \text{km/sec/Mpc}}\right)^2 \tag{15.2.13} $$

This is smaller than the critical density (15.2.3) by a factor

$$ \frac{\rho_G}{\rho_c} \simeq 0.028 \tag{15.2.14} $$

(More recently, Noonan 7 and S. L. Shapiro 7a have given estimates of 0.016 and
0.010 for this ratio.) Note that such results are independent of the true value of
Hubble’s constant. Note also that although $\rho_G$ and $\rho_c$ do not turn out to be
equal, they are close enough to reassure us that gravitation does have something to do
with the expansion of the universe.
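The arithmetic behind the galactic mass density, and the cancellation of the Hubble constant in its ratio to the critical density, can be sketched as follows. The solar mass, parsec, and gravitational constant are assumed modern rounded values, and the function names are illustrative; with these inputs the ratio comes out near 0.03, consistent with the figure quoted above to the accuracy of the rounding.

```python
import math

M_SUN_G = 1.989e33         # solar mass in grams (assumed value)
CM_PER_PC = 3.0857e18      # centimetres per parsec (assumed value)
G_CGS = 6.674e-8           # Newton's constant in cm^3 g^-1 s^-2 (assumed value)

def galactic_mass_density(h0_km_s_mpc):
    """Galactic mass density in g/cm^3 from Oort's M/L and luminosity density."""
    scale = h0_km_s_mpc / 75.0
    mass_to_light = 21.0 * scale               # in units of M_sun / L_sun
    luminosity_density = 2.2e-10 * scale       # in units of L_sun / pc^3
    rho_solar_per_pc3 = mass_to_light * luminosity_density
    return rho_solar_per_pc3 * M_SUN_G / CM_PER_PC ** 3

def critical_density(h0_km_s_mpc):
    """rho_c = 3 H0^2 / (8 pi G) in g/cm^3."""
    h0 = h0_km_s_mpc * 1.0e5 / (CM_PER_PC * 1.0e6)   # Hubble constant in s^-1
    return 3.0 * h0 ** 2 / (8.0 * math.pi * G_CGS)

for h0 in (50.0, 75.0, 120.0):
    rho_g = galactic_mass_density(h0)
    ratio = rho_g / critical_density(h0)
    print(f"H0 = {h0:5.1f}   rho_G = {rho_g:.2e} g/cm^3   rho_G/rho_c = {ratio:.3f}")
```

Both densities scale as $H_0^2$, so the last column is the same for every assumed Hubble constant.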

If the mass of the universe were primarily concentrated in galaxies, then
Eqs. (15.2.14) and (15.2.6) would yield a deacceleration parameter

$$ q_0 \simeq 0.014 \quad \text{if}\ \rho_0 \simeq \rho_G \tag{15.2.15} $$

which would imply that the universe is negatively curved and open, with
$R_0 \simeq H_0^{-1}$. This value of $q_0$ is not in agreement with the result found
from red shifts and luminosities, which give $q_0 \simeq 1$, apart from possible
corrections for evolution or selection effects. Of course, evolution or selection
effects may have an appreciable effect on the measurements of $q_0$. However, if one
tentatively accepts the result that $q_0$ is of order unity, then one is forced to the
conclusion that the mass density of about $2 \times 10^{-29}$ g/cm$^3$ must be found
somewhere outside the normal galaxies. But where?

One place to look for the missing mass is in the intergalactic space within
clusters of galaxies. In Coma there is a rich cluster of elliptical galaxies that appears
from its smooth shape to be gravitationally bound. If bound, its mass is given by
the virial formula (15.2.7). The values of $M/L$ obtained in this way range 8 from
4 to 20 times the $M/L$ ratio for individual elliptical galaxies. (These values are for
$H_0 = 75$ km/sec/Mpc.) If there actually is 20 times more matter within clusters
than within their individual galaxies, then the density of the universe is raised to
near the critical density (15.2.3). In fact, an X-ray source has been discovered 8a
filling the Coma cluster, which suggests the presence of an intergalactic gas of
ionized hydrogen, at a temperature of order $7 \times 10^7$ °K. However, the strength
of this source indicates that this gas has a mass only a percent or so of the mass
required by the virial theorem. It must be kept in mind that the Coma cluster may
not be bound at all, 9 in which case the virial theorem overestimates its mass. Many
rich clusters, like those in Virgo or Hercules, are highly irregular, and do not
appear at all stable.

If the missing mass is not within clusters of galaxies, then we must look for it 
in the space between the clusters. One reasonable requirement is that the total 
density of intercluster space must be less than the density within clusters, so that 
the clusters represent appreciable condensations. The total volume outside clusters 
is roughly 500 times greater than the volume within clusters, so the density within
clusters is roughly 500 times (15.2.13), or about $10^{-28}$ g/cm$^3$. Hence, even if the
density outside clusters is an order of magnitude less than the density within 
clusters, there is still plenty of room in intercluster space for all the missing 
mass we need. 

It is possible that the missing mass might be contained in normal stars that 
happen to lie in intergalactic space (inside or outside clusters) or in dwarf galaxies 
which are too faint to have been observed. From limits on the extragalactic 
contribution to the night-sky brightness, Peebles and Partridge 9a estimate that the
total mass density in normal stars, wherever located, must be less than $0.13\,\rho_c$.
This estimate does not rule out the possibility that the missing mass is contained
in dark stars, with very high values of $M/L$, either in dwarf galaxies or in
intergalactic space. One immediately thinks of the “black holes” discussed in Section
11.9. However, the estimates of galactic mass discussed above show that typical 
galaxies do not contain overwhelming numbers of dark stars, so why should dark 
stars predominate anywhere else ? Another possibility is that the missing mass is 
contained in whole galaxies that have undergone gravitational collapse. It is 
difficult to see how this hypothesis could ever be verified, except through observa- 
tions of galaxies in the throes of collapse, or through observation of the deflection 
of light rays that happen to pass close to a collapsed galaxy. 

The missing mass might be found in the form of highly relativistic particles,
such as cosmic rays, photons, neutrinos, or gravitons. It is easy to see that photons
and neutrinos produced in ordinary thermonuclear processes cannot have an energy
density comparable with that of ordinary nonrelativistic rest-mass, for even if the
universe started out as pure hydrogen and has “cooked” all the way to iron, the
energy released would be at most about 9 MeV per nucleon, which is 1% of the
nucleon rest-mass. If highly relativistic particles dominate the mass density of
the universe, then they must be produced in exotic processes like
matter-antimatter annihilation or gravitational collapse, or be left over from the
early universe. The observed total flux density of faint discrete radio sources at
frequency $\nu$ is of the order 10


$$ \mathcal{S}(\nu) \approx 10^{-21}\ \text{W m}^{-2}\,\text{Hz}^{-1} \left[\,\cdots\,\right]^{-0.7} $$
so the total energy density of the radio emission at wavelengths longer than 75 cm
from these sources is roughly

$$ \rho_{\text{radio}} \simeq \int_0^{400\ \text{MHz}} \mathcal{S}(\nu)\,d\nu \simeq 10^{-12}\ \text{W m}^{-2} \simeq 10^{-40}\ \text{g/cm}^3 $$

The isotropic background at these wavelengths is not more than an order of
magnitude greater. 11 For microwave and far-infrared wavelengths between 75 cm
and 0.05 cm, the radiation flux is dominated by the 2.7°K background (see Section
15.5), with an energy density given by the Stefan-Boltzmann law as
$4.4 \times 10^{-34}$ g/cm$^3$. The total energy density of starlight at optical
frequencies is estimated 12 to be no more than about $10^{-35}$ g/cm$^3$. The observed
X-ray background has a flux density at energy $E$ of order 13

$$ \Phi(E) \simeq 20\ \text{photons cm}^{-2}\,\text{sec}^{-1}\,\text{sr}^{-1}\,\text{keV}^{-1}\ (E(\text{keV}))^{-2} $$

If this background is extragalactic, then it contains an energy density between
0.1 keV and 1 MeV given by

$$ \rho_{\text{x-ray}} = \int_{0.1\ \text{keV}}^{1\ \text{MeV}} 4\pi \Phi(E) E\,dE \simeq 3 \times 10^3\ \text{keV cm}^{-2}\,\text{sec}^{-1} \simeq 10^{-37}\ \text{g/cm}^3 $$

The energy density in $\gamma$-rays above 100 MeV is estimated 11 to be less than
$3 \times 10^{-38}$ g/cm$^3$. The observed 14 energy density of cosmic ray particles is
not more than about $10^{-35}$ g/cm$^3$.

These estimates indicate that the largest contribution of relativistic particles 
to the total cosmic energy density is provided by the 2.7°K microwave background, 
to be discussed in Section 15.5. Its density is less than one-hundredth the density 
(15.2.13) of galactic rest-mass, which justifies our tentative neglect of pressure in 
the Einstein and conservation equations. 
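Two of the estimates above can be reproduced with a few lines of arithmetic: the black-body energy density at 2.7°K, and the integral of the quoted $E^{-2}$ X-ray spectrum between 0.1 keV and 1 MeV. The sketch below assumes modern rounded values of the radiation constant, the speed of light, and the keV-to-erg conversion; these constants and the function names are assumptions for illustration.

```python
import math

RADIATION_CONSTANT = 7.566e-15   # a = 4 sigma / c in erg cm^-3 K^-4 (assumed value)
C_CM_S = 2.998e10                # speed of light in cm/s (assumed value)
ERG_PER_KEV = 1.602e-9           # ergs per keV (assumed value)

def blackbody_mass_density(temperature_k):
    """Equivalent mass density a T^4 / c^2 of black-body radiation, in g/cm^3."""
    return RADIATION_CONSTANT * temperature_k ** 4 / C_CM_S ** 2

def xray_mass_density(e_min_kev=0.1, e_max_kev=1.0e3, norm=20.0):
    """Equivalent mass density of an isotropic background with Phi(E) = norm * E^-2.

    The integral of 4 pi Phi(E) E dE is 4 pi norm ln(e_max / e_min), in
    keV cm^-2 s^-1; dividing by c gives the energy density.
    """
    flux_kev = 4.0 * math.pi * norm * math.log(e_max_kev / e_min_kev)
    energy_density = flux_kev * ERG_PER_KEV / C_CM_S       # erg cm^-3
    return energy_density / C_CM_S ** 2

RHO_GALACTIC = 3.1e-31   # galactic rest-mass density from the text, g/cm^3

rho_cmb = blackbody_mass_density(2.7)
print(f"2.7 K background : {rho_cmb:.2e} g/cm^3")
print(f"X-ray background : {xray_mass_density():.2e} g/cm^3")
print(f"background / galactic rest-mass = {rho_cmb / RHO_GALACTIC:.1e}")
```

The microwave background dominates the other radiation fields by an order of magnitude or more, yet remains well under one percent of the galactic rest-mass density.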

However, it is possible that the missing mass is made up by neutrinos or 
gravitons, 143 which interact too weakly with matter to have been detected. In 
particular, the neutrino energy density is expected to be at least comparable with 
that of microwave electromagnetic radiation, but may well be many orders of 
magnitude greater (see Section 15.6). If the energy density of the universe is
dominated by highly relativistic particles, then the pressure is

$$p_0 = \frac{\rho_0}{3} \tag{15.2.16}$$

and in place of (15.2.5) and (15.2.6), the Einstein equations now give

$$\frac{k}{R_0^2} = H_0^2 (q_0 - 1) \tag{15.2.17}$$

$$\frac{\rho_0}{\rho_c} = q_0 \tag{15.2.18}$$

where $\rho_c$ is the same critical density (15.2.3) as before. The critical deacceleration
parameter, for which $k = 0$ and $\rho_0 = \rho_c$, is now $q_0 = 1$ rather than $q_0 = \tfrac{1}{2}$, and
the density required for a given $q_0$ and $H_0$ is half of that needed for a dust-filled
universe.

Although a photon-, neutrino-, or graviton-dominated universe cannot be 
ruled out on observational grounds at present, it is more conservative to suppose 
that the missing mass takes the form of a tenuous hydrogen gas, ionized or neutral, 
filling all space. The various methods that have been proposed to detect this gas 
depend on electromagnetic signals that reach us from cosmological distances, so 
we must defer our discussion of this gas until Section 15.4, and turn now to the 
solution of the equations of dynamical cosmology. 


3 The Matter-Dominated Era 


We have noted that the energy density of the known forms of radiation in the 
present universe is less than one-hundredth the density of rest-mass. According to 
Eqs. (15.1.22) and (15.1.23), the energy density of rest-mass scales as $R^{-3}$, and the
energy density of radiation scales as $R^{-4}$, so we may conclude with some confidence
that the expansion of the universe has been governed by its nonrelativistic
matter content at least since the time when R(t) was one-hundredth its present 
value. This period certainly goes back long before the emission of any of the light 
collected at Mt. Palomar, for the most distant galaxies and quasi-stellar objects 
observed have red shifts z that are much less than 100, and in fact less than 3! 
The study of the empirical relations between red shifts, luminosities, numbers, 
angular diameters, and so on, can therefore reveal only the matter-dominated era 
of the history of the universe. 

The dynamical equation governing the universe during this era is Einstein’s
equation (15.1.20):

$$\dot R^2 + k = \frac{8\pi G}{3}\,\rho R^2 \tag{15.3.1}$$

with $\rho$ taking the form (15.1.22) appropriate to a matter-dominated universe:

$$\frac{\rho}{\rho_0} = \left(\frac{R}{R_0}\right)^{-3} \tag{15.3.2}$$

It is convenient to make use of equations (15.2.5) and (15.2.6) to write $\rho_0$ and
$k/R_0^2$ in terms of $q_0$ and $H_0$:

$$\frac{k}{R_0^2} = (2q_0 - 1)H_0^2 \qquad \frac{8\pi G \rho_0}{3} = 2 q_0 H_0^2$$

Equations (15.3.1) and (15.3.2) then give

$$\left(\frac{\dot R}{R_0}\right)^2 = H_0^2 \left[1 - 2q_0 + 2q_0 \left(\frac{R_0}{R}\right)\right] \tag{15.3.3}$$

The solution may in general be expressed as a formula for $t$ in terms of $R$:

$$t = \frac{1}{H_0} \int_0^{R(t)/R_0} \left[1 - 2q_0 + \frac{2q_0}{x}\right]^{-1/2} dx \tag{15.3.4}$$

with $t = 0$ defined as the time when $R = 0$. In particular, the present age of the
universe is

$$t_0 = \frac{1}{H_0} \int_0^{1} \left[1 - 2q_0 + \frac{2q_0}{x}\right]^{-1/2} dx \tag{15.3.5}$$

For any positive $q_0$, the age of the universe must be less than the Hubble time,

$$t_0 < \frac{1}{H_0} \tag{15.3.6}$$

as already remarked in Section 15.1.

The behavior of the result (15.3.4) may conveniently be discussed under three
special cases (see Figure 15.1):

(A) $q_0 > \tfrac{1}{2}$ ($k = +1$, $\rho_0 > \rho_c$). It is convenient here to define a development
angle $\theta$ by

$$1 - \cos\theta = \frac{(2q_0 - 1)}{q_0}\,\frac{R(t)}{R_0} \tag{15.3.7}$$

Figure 15.1 Solutions of Einstein’s equations for a Robertson-Walker universe with
curvature $k = +1$, $k = 0$, and $k = -1$. The numbers along the curves $k = \pm 1$ give
the values of the deacceleration parameter $q_0$ at various epochs.

Then (15.3.4) gives

$$H_0 t = q_0 (2q_0 - 1)^{-3/2} [\theta - \sin\theta] \tag{15.3.8}$$


This is the equation of a cycloid; $R(t)$ increases from zero at $\theta = 0$, $t = 0$; reaches
a maximum at

$$\theta_m = \pi \qquad t_m = \frac{\pi q_0}{H_0 (2q_0 - 1)^{3/2}} \qquad R(t_m) = \frac{2 q_0 R_0}{2q_0 - 1} \tag{15.3.9}$$

and then returns to zero at $\theta = 2\pi$, $t = 2t_m$. The present instant is defined by
setting $R(t)$ equal to $R_0$ in Eq. (15.3.7); the present value of the development
angle $\theta$ is then given by

$$\cos\theta_0 = \frac{1}{q_0} - 1 \tag{15.3.10}$$

and so the age of the universe is

$$t_0 = H_0^{-1} q_0 (2q_0 - 1)^{-3/2} \left[\cos^{-1}\left(\frac{1}{q_0} - 1\right) - \frac{1}{q_0}(2q_0 - 1)^{1/2}\right] \tag{15.3.11}$$


For example, if we believe that $q_0 \simeq 1$ and $H_0^{-1} \simeq 13 \times 10^{9}$ years, then
Eq. (15.3.10) gives $\theta_0 \simeq \pi/2$, Eq. (15.3.11) gives the age of the universe now as

$$t_0 \simeq \left(\frac{\pi}{2} - 1\right) H_0^{-1} \simeq 7.5 \times 10^{9}\ \text{years} \tag{15.3.12}$$

and (15.3.9) shows that the universe will reach its maximum radius $R(t_m) \simeq 2R_0$
at a time

$$t_m \simeq \pi H_0^{-1} \simeq 40 \times 10^{9}\ \text{years} \tag{15.3.13}$$

The whole life cycle of the universe takes a time $2t_m$, or about $80 \times 10^{9}$ years.

(B) $q_0 = \tfrac{1}{2}$ ($k = 0$, $\rho_0 = \rho_c$). Here (15.3.4) gives

$$\frac{R(t)}{R_0} = \left(\frac{3 H_0 t}{2}\right)^{2/3} \tag{15.3.14}$$

so $R(t)$ increases without limit. The age of the universe is given for $H_0^{-1} \simeq 13 \times 10^{9}$
years as

$$t_0 = \tfrac{2}{3} H_0^{-1} \simeq 9 \times 10^{9}\ \text{years} \tag{15.3.15}$$

This is known as the Einstein-de Sitter model.

(C) $0 < q_0 < \tfrac{1}{2}$ ($k = -1$, $\rho_0 < \rho_c$). The results (15.3.7) and (15.3.8) can be
applied here, except that now the development angle $\theta$ is imaginary,

$$\theta = i\Psi$$

and so

$$H_0 t = q_0 (1 - 2q_0)^{-3/2} [\sinh\Psi - \Psi] \tag{15.3.16}$$

with $\Psi$ given by

$$\cosh\Psi - 1 = \frac{(1 - 2q_0)}{q_0}\,\frac{R(t)}{R_0} \tag{15.3.17}$$

Just as in case (B), the scale factor $R(t)$ here increases without limit; for $t \to \infty$,
we have

$$\frac{R(t)}{R_0} \to \tfrac{1}{2} q_0 (1 - 2q_0)^{-1} e^{\Psi} \to (1 - 2q_0)^{1/2} H_0 t \tag{15.3.18}$$



The present moment is defined by setting $R(t)$ equal to $R_0$ in Eq. (15.3.17):

$$\cosh\Psi_0 = \frac{1}{q_0} - 1 \tag{15.3.19}$$

and the age of the universe is

$$t_0 = H_0^{-1} \left[(1 - 2q_0)^{-1} - q_0 (1 - 2q_0)^{-3/2} \cosh^{-1}\left(\frac{1}{q_0} - 1\right)\right] \tag{15.3.20}$$

For instance, if we take the mass density of the universe to be that contained
within the galaxies, then according to Eq. (15.2.15), $q_0$ is about 0.014, so $\Psi_0 \simeq 5$,
and the age of the universe is nearly equal to the Hubble time

$$t_0 \simeq 0.96\, H_0^{-1} \simeq 13 \times 10^{9}\ \text{years} \tag{15.3.21}$$
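
The three closed-form ages can be compared directly with the defining integral (15.3.5). The Python sketch below is only an illustration; the function names, the midpoint-rule quadrature, and the tolerance used to detect $q_0 = \tfrac{1}{2}$ are choices made here, not part of the text. Ages are returned in units of the Hubble time $H_0^{-1}$.

```python
import math


def age_closed_form(q0):
    """H0 * t0 from Eqs. (15.3.11), (15.3.15) and (15.3.20), valid for q0 > 0."""
    if q0 <= 0.0:
        raise ValueError("q0 must be positive")
    if abs(q0 - 0.5) < 1e-6:
        # k = 0 (Einstein-de Sitter); also avoids cancellation very close to 1/2
        return 2.0 / 3.0
    if q0 > 0.5:
        # k = +1: present development angle from Eq. (15.3.10)
        theta0 = math.acos(1.0 / q0 - 1.0)
        return q0 * (2.0 * q0 - 1.0) ** -1.5 * (theta0 - math.sin(theta0))
    # k = -1: present value of Psi from Eq. (15.3.19)
    psi0 = math.acosh(1.0 / q0 - 1.0)
    return q0 * (1.0 - 2.0 * q0) ** -1.5 * (math.sinh(psi0) - psi0)


def age_numeric(q0, steps=100_000):
    """H0 * t0 from the integral (15.3.5), evaluated with the midpoint rule."""
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        total += (1.0 - 2.0 * q0 + 2.0 * q0 / x) ** -0.5
    return total * h


if __name__ == "__main__":
    hubble_time = 13e9  # years, the value used in the text
    for q0 in (0.014, 0.5, 1.0, 5.0):
        t_closed = age_closed_form(q0) * hubble_time
        t_numeric = age_numeric(q0) * hubble_time
        print(f"q0 = {q0:5.3f}: {t_closed:.2e} yr (closed), {t_numeric:.2e} yr (integral)")
```

For every positive $q_0$ the result is below 1, as required by (15.3.6), and it decreases as $q_0$ grows.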


It is worth mentioning here that the deacceleration parameter $q \equiv -\ddot R R/\dot R^2$
will in general change with time. For $k = +1$, $q(t)$ is given by the analogue of
Eq. (15.3.10),

$$q = (1 + \cos\theta)^{-1}$$

so as $\theta$ goes from 0 to $2\pi$ during one cosmic cycle, $q$ rises from $\tfrac{1}{2}$ to $\infty$ and then
drops again to $\tfrac{1}{2}$. For $k = -1$, $q(t)$ is given by the analogue of Eq. (15.3.19):

$$q = (1 + \cosh\Psi)^{-1}$$

so as $\Psi$ goes from 0 to $\infty$, $q$ drops steadily from $\tfrac{1}{2}$ to 0. Only in the case $k = 0$
does $q$ remain constant, at the value $q = \tfrac{1}{2}$. Thus no special significance can be
attached to any particular value of $q_0$ other than $q_0 = \tfrac{1}{2}$. For instance, the
straight-line fit of luminosity distance versus red shift indicates that $q_0 \simeq 1$
(unless evolutionary or selection effects are important; see Section 14.6), but in a
matter-dominated universe this must be an accident, for if at present $q_0 = 1$,
then previously $q_0 < 1$, and in the future $q_0 > 1$. It is only in a radiation-
dominated universe that $k = 0$ entails $q_0 = 1$ [see Eq. (15.2.17)], so that a
deacceleration parameter of unity is stable.
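
The relation $q = (1 + \cos\theta)^{-1}$ for $k = +1$ can be confirmed numerically from the cycloid itself. In the sketch below (a check written for this purpose, with hypothetical function names) $R \propto 1 - \cos\theta$ and $t \propto \theta - \sin\theta$; the constant prefactors cancel in $q = -\ddot R R/\dot R^2$, so they are set to unity, and the time derivatives are taken by central differences in $\theta$.

```python
import math


def q_closed_form(theta):
    """Deacceleration parameter on the k = +1 cycloid."""
    return 1.0 / (1.0 + math.cos(theta))


def q_numeric(theta, h=1e-4):
    """q = -R'' R / R'^2 from finite differences of the parametric solution."""

    def scale(th):
        return 1.0 - math.cos(th)      # R up to a constant factor

    def time(th):
        return th - math.sin(th)       # t up to a constant factor

    def r_dot(th):
        return (scale(th + h) - scale(th - h)) / (time(th + h) - time(th - h))

    r_ddot = (r_dot(theta + h) - r_dot(theta - h)) / (time(theta + h) - time(theta - h))
    return -r_ddot * scale(theta) / r_dot(theta) ** 2


if __name__ == "__main__":
    for theta in (0.5, math.pi / 2, 2.5):
        print(theta, q_closed_form(theta), q_numeric(theta))
```

At $\theta = \pi/2$, the present epoch for $q_0 = 1$, both expressions give $q = 1$; smaller angles give $q < 1$ and larger angles $q > 1$.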

The formulas for $R(t)$ derived above may be used to extend the phenomenological
analysis of the last chapter out to arbitrarily large red shifts. According to
Eq. (14.3.6), light that arrives at time $t_0$ with red shift $z$ was emitted when the
scale factor had the value

$$R_1 = \frac{R_0}{1 + z} \tag{15.3.22}$$

The comoving radial coordinate $r_1$ of the light source is given by Eqs. (14.3.1),
(14.3.2), and (15.3.3):

$$\int_0^{r_1} \frac{dr}{\sqrt{1 - kr^2}} = \int_{t_1}^{t_0} \frac{dt}{R(t)} = \frac{1}{R_0 H_0} \int_{(1+z)^{-1}}^{1} \left[1 - 2q_0 + \frac{2q_0}{x}\right]^{-1/2} \frac{dx}{x}$$

With the aid of Eq. (15.2.5), it is straightforward to show that for all three possible
values of $k$, the formula for $r_1$ is the same:

$$r_1 = \frac{z q_0 + (q_0 - 1)\left(-1 + \sqrt{2 q_0 z + 1}\right)}{H_0 R_0 q_0^2 (1 + z)} \tag{15.3.23}$$


The “luminosity distance,” measured by comparison of apparent with absolute
luminosity, is then given by (14.4.14) as

$$d_L = R_0 r_1 (1 + z) = \frac{1}{H_0 q_0^2} \left[z q_0 + (q_0 - 1)\left(-1 + \sqrt{2 q_0 z + 1}\right)\right] \tag{15.3.24}$$

In some determinations of $q_0$, the observed $d_L$ versus $z$ curve is compared with this
exact formula, rather than the model-independent approximation (14.6.8),

$$d_L \simeq H_0^{-1} \left[z + \tfrac{1}{2}(1 - q_0) z^2\right] \tag{15.3.25}$$

which is strictly valid only in the limit $z \to 0$. The difference between (15.3.24)
and (15.3.25) vanishes for $q_0 = 0$ and $q_0 = 1$, and is less than 10% for $0 < z < 0.5$
(which includes all galaxies with known red shifts) and $0 < q_0 < 1.5$. Hence,
as long as $q_0$ is not very large, there is no substantial difference between using the
exact formula (15.3.24) and the approximation (15.3.25) to determine $q_0$.
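
The size of the difference between the exact and approximate luminosity distances is easy to tabulate. The following Python sketch (function names and the sampling grid are choices made here) evaluates (15.3.24) and (15.3.25) in units where $H_0 = 1$ and reports the largest fractional difference over a grid in $z$ and $q_0$.

```python
import math


def d_l_exact(z, q0, h0=1.0):
    """Luminosity distance of Eq. (15.3.24); requires q0 > 0."""
    bracket = z * q0 + (q0 - 1.0) * (-1.0 + math.sqrt(2.0 * q0 * z + 1.0))
    return bracket / (h0 * q0**2)


def d_l_approx(z, q0, h0=1.0):
    """Second-order expansion of Eq. (15.3.25)."""
    return (z + 0.5 * (1.0 - q0) * z**2) / h0


def max_fractional_difference(z_max=0.5, q_min=0.05, q_max=1.5, n=60):
    """Largest |exact - approx| / exact on an n-by-n grid (grid limits are assumptions)."""
    worst = 0.0
    for i in range(1, n + 1):
        z = z_max * i / n
        for j in range(n + 1):
            q0 = q_min + (q_max - q_min) * j / n
            exact = d_l_exact(z, q0)
            worst = max(worst, abs(exact - d_l_approx(z, q0)) / exact)
    return worst


if __name__ == "__main__":
    print(f"largest fractional difference: {max_fractional_difference():.3f}")
```

For $q_0 = 1$ both expressions reduce to $d_L = z/H_0$, which is why the difference vanishes there.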

The “angular diameter distance” $d_A$ and the “proper motion distance” $d_M$
are immediately given in terms of $d_L$ by Eqs. (14.4.22) and (14.4.23):

$$d_A = (1 + z)^{-2} d_L \qquad d_M = (1 + z)^{-1} d_L$$

and the “parallax distance” $d_P$ is given by (14.4.10) as

$$d_P = \frac{\left[z q_0 + (q_0 - 1)\left(-1 + \sqrt{2 q_0 z + 1}\right)\right]}{H_0 \left[q_0^4 (1 + z)^2 - (2q_0 - 1)\left\{z q_0 + (q_0 - 1)\left(-1 + \sqrt{2 q_0 z + 1}\right)\right\}^2\right]^{1/2}} \tag{15.3.26}$$

The number counts discussed in Section 14.7 can now be expressed more
explicitly as functionals of the source density $n$. Shifting variables from $t_1$ to $z$,
and using (15.3.4), (15.3.22), (15.3.23), and (14.7.7)-(14.7.9) gives the number of
sources with red shift less than $z$ and apparent luminosity greater than $l$ as

$$\begin{aligned} N(<z, >l) = \int_0^{\infty} dL \int_0^{\min(z,\, z_l(L))} & 4\pi H_0^{-3} q_0^{-4} (1 + z')^{-6} (1 + 2 q_0 z')^{-1/2} \\ & \times \left[z' q_0 + (q_0 - 1)\left(-1 + \sqrt{2 q_0 z' + 1}\right)\right]^2 n(z', L)\, dz' \end{aligned} \tag{15.3.27}$$

where

$$z_l(L) \equiv q_0 \left(\frac{L H_0^2}{4\pi l}\right)^{1/2} + (1 - q_0)\left\{-1 + \left[1 + 2\left(\frac{L H_0^2}{4\pi l}\right)^{1/2}\right]^{1/2}\right\} \tag{15.3.28}$$


and $n(z, L)\, dL$ is the proper number density of sources at red shift $z$ with absolute
luminosity between $L$ and $L + dL$. For radio sources with the spectrum (14.7.13),
Eqs. (15.3.4), (15.3.22), (15.3.23), (14.7.15), (14.7.16), and (14.7.8) give the number
of sources with red shift less than $z$ and flux density at frequency $\nu$ greater than
$S$ as

$$\begin{aligned} N(<z, >S; \nu) = \int_0^{\infty} dP \int_0^{\min(z,\, z_{S\alpha}(P))} & 4\pi H_0^{-3} q_0^{-4} (1 + z')^{-6} (1 + 2 q_0 z')^{-1/2} \\ & \times \left(z' q_0 + (q_0 - 1)\left(-1 + \sqrt{2 q_0 z' + 1}\right)\right)^2 n(z', P; \nu)\, dz' \end{aligned} \tag{15.3.29}$$

where $z_{S\alpha}(P)$ is the solution of the equation

$$(1 + z)^{(\alpha - 1)/2} \left(z q_0 + (q_0 - 1)\left(-1 + \sqrt{2 q_0 z + 1}\right)\right) = q_0^2 H_0 \left(\frac{P}{S}\right)^{1/2} \tag{15.3.30}$$

and $n(z, P; \nu)\, dP$ is the proper number density of sources at red shift $z$ with intrinsic
power at frequency $\nu$ between $P$ and $P + dP$. If there were no evolution of sources,
then (14.7.18) and (14.7.19) would give $n$ the $z$-dependence,

$$n(z, L) = n(0, L)(1 + z)^3$$
$$n(z, P; \nu) = n(0, P; \nu)(1 + z)^3$$


and the $z'$ integrals in (15.3.27) and (15.3.29) could be done explicitly. However,
we have already seen in Section 14.7 that this hypothesis does not agree with
measurements of $N(>S; \nu)$ for radio sources or $N(<z)$ for quasi-stellar sources.
Thus (15.3.27) and (15.3.29) can best be used to gain information about the $z$ and $L$
or $P$ dependence of $n(z, L)$ or $n(z, P; \nu)$. In this way, Longair$^{15}$ has concluded that
either the number density of radio sources has been decreasing (apart from the
expansion of the universe) like $t^{-2.5}$, or the mean source power has been decreasing
like $t^{-3.5}$. In addition, a sharp cutoff seems to be needed at early times, although
this conclusion is not definitely established.$^{16}$ A study of the quasi-stellar sources
in the 3C catalogue by Schmidt$^{17}$ shows the same general features: a proper
number density that increases with $z$ much faster than $(1 + z)^3$ for $0 < z < 1$,
and falls off sharply for $z > 2$. It may be that this cutoff marks the epoch of galaxy
or quasi-stellar source formation.
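
The limiting red shift $z_l(L)$ that appears in the number count (15.3.27) is simply the inverse of the luminosity distance relation: it is the red shift at which a source of absolute luminosity $L$ has apparent luminosity $l = L/4\pi d_L^2$. The Python sketch below (hypothetical function names, units with $H_0 = 1$ by default) checks that the closed form $z_l = q_0 D + (1 - q_0)\{-1 + [1 + 2D]^{1/2}\}$, with $D = H_0 (L/4\pi l)^{1/2}$, undoes the luminosity distance formula.

```python
import math


def luminosity_distance(z, q0, h0=1.0):
    """d_L(z) for a matter-dominated model with q0 > 0."""
    bracket = z * q0 + (q0 - 1.0) * (-1.0 + math.sqrt(2.0 * q0 * z + 1.0))
    return bracket / (h0 * q0**2)


def limiting_redshift(luminosity, apparent, q0, h0=1.0):
    """Red shift at which `luminosity` is seen with apparent luminosity `apparent`."""
    d = h0 * math.sqrt(luminosity / (4.0 * math.pi * apparent))  # H0 * d_L
    return q0 * d + (1.0 - q0) * (-1.0 + math.sqrt(1.0 + 2.0 * d))


if __name__ == "__main__":
    lum = 1.0  # arbitrary units
    for q0, z in ((0.3, 0.7), (1.0, 0.2), (2.0, 3.0)):
        flux = lum / (4.0 * math.pi * luminosity_distance(z, q0) ** 2)
        print(q0, z, limiting_redshift(lum, flux, q0))
```

Each printed value reproduces the input red shift, so sources brighter than $l$ are exactly those with $z' < z_l(L)$.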

The age of the universe is one more important datum that can help us to 
decide among different cosmological models. A firm lower limit on the age of the 
universe is provided by the age of the earth, determined from the relative 
abundances of radioactive elements and their decay products in the earth’s crust. 
In 1929 Lord Rutherford$^{18}$ calculated this age to be about $3.4 \times 10^{9}$ years.
Modern studies$^{19}$ give for the age of the earth a reliable value of $4.5 \times 10^{9}$ years.
If the Hubble time $H_0^{-1}$ is $13 \times 10^{9}$ years, then according to Eq. (15.3.11), the
lower limit $t_0 > 4.5 \times 10^{9}$ years requires that $q_0 < 5$.

Radioactive dating can also be applied to our galaxy. The basic work on the 
stellar synthesis of the heavy elements is a 1957 paper by the Burbidges, Fowler, 
and Hoyle. 20 (The purpose of this paper was, at least in part, to defend the steady 
state model, by showing that the elements could be formed in stars, without 
needing a “big bang.” As discussed in Section 15.7, it is generally accepted today 
that the elements were mostly formed in stars, with the very important exception 
that helium may have been formed in the hot early universe.) According to this 
work, the isotopes of uranium were formed by a rapid process of neutron addition, 
the r-process, in an earlier generation of stars. The initial abundance ratio is
calculated as$^{21}$

$$\left[\frac{\mathrm{U}^{235}}{\mathrm{U}^{238}}\right]_i = 1.65 \pm 0.15 \qquad \text{(at formation)}$$

The decay rates of these isotopes are accurately known to be

$$\lambda(\mathrm{U}^{235}) = 0.971 \times 10^{-9}/\text{year}$$
$$\lambda(\mathrm{U}^{238}) = 0.154 \times 10^{-9}/\text{year}$$

and at present their abundance ratio is

$$\left[\frac{\mathrm{U}^{235}}{\mathrm{U}^{238}}\right]_0 = 0.00723 \qquad \text{(at present)}$$

If all the uranium was formed promptly after the birth of the galaxy at a time $t_G$,
then the age of the galaxy must be$^{20}$

$$t_0 - t_G = \frac{\ln\,[\mathrm{U}^{235}/\mathrm{U}^{238}]_i - \ln\,[\mathrm{U}^{235}/\mathrm{U}^{238}]_0}{\lambda(\mathrm{U}^{235}) - \lambda(\mathrm{U}^{238})} \simeq 6.6 \times 10^{9}\ \text{years}$$

Any society that developed during an earlier epoch in the history of the galaxy
would have found a larger proportion of the fissionable isotope $\mathrm{U}^{235}$ than is now
available on earth, and could therefore have moved toward nuclear destruction
even faster than our own civilization.

An error of 20% in the estimated initial abundance ratio of $\mathrm{U}^{235}$ and $\mathrm{U}^{238}$
would produce only a 4% error in the age of the galaxy. A much greater source of
uncertainty arises from the possibility that appreciable amounts of uranium were
formed well after the galaxy. In this case, the galaxy must be appreciably older than
$6.6 \times 10^{9}$ years. In order to settle this question, other abundance ratios have been
used in conjunction with the ratio of $\mathrm{U}^{235}$ to $\mathrm{U}^{238}$, the duration of the period of
element synthesis being taken as a free parameter along with the time when this
period began. Using the $\mathrm{Th}^{232}/\mathrm{U}^{238}$ and $\mathrm{U}^{235}/\mathrm{U}^{238}$ ratios, Fowler and Hoyle$^{21}$
estimated that the age of the oldest r-process elements is between $9.6 \times 10^{9}$ and
$15.6 \times 10^{9}$ years. Clayton$^{22}$ has included the $\mathrm{Re}^{187}/\mathrm{Os}^{187}$ ratio in his analysis,
with results similar to those of Fowler and Hoyle. However, chemical separation
effects may be important here, so these results are subject to possible large
systematic errors. Dicke$^{23}$ has persistently argued that the bulk of the r-process
elements were produced within a few hundred million years of the formation of
the galaxy, in which case the age of the galaxy would be close to $7 \times 10^{9}$ years.
It is safe to conclude that the galaxy and hence the universe is at least $7 \times 10^{9}$
years old, so that $q_0 < 2.3$ for $H_0^{-1} \simeq 13 \times 10^{9}$ years, but radioactive dating
cannot yet be considered to give a precise age for the galaxy.
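
The uranium age for prompt synthesis, and its weak sensitivity to the assumed initial ratio, follow from one logarithm. The short Python sketch below uses only the decay rates and abundance ratios quoted above; the function name and the 20% perturbation applied in the example are choices made here for illustration.

```python
import math

LAMBDA_U235 = 0.971e-9  # decay rate per year
LAMBDA_U238 = 0.154e-9  # decay rate per year


def uranium_age(initial_ratio=1.65, present_ratio=0.00723):
    """Time in years for U235/U238 to fall from initial_ratio to present_ratio."""
    log_drop = math.log(initial_ratio) - math.log(present_ratio)
    return log_drop / (LAMBDA_U235 - LAMBDA_U238)


if __name__ == "__main__":
    base = uranium_age()
    shifted = uranium_age(initial_ratio=1.2 * 1.65)
    print(f"age for prompt synthesis: {base:.2e} years")
    print(f"fractional change for a 20% larger initial ratio: {shifted / base - 1:.3f}")
```

Because the age depends only on the logarithm of the ratio, a 20% change in the initial abundance shifts the result by a few percent.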

It is also possible to estimate the age of the galaxy by study of its globular
clusters. These are large compact clusters containing thousands of individual
stars, so their Hertzsprung-Russell diagrams (the relation between luminosity and
spectral type) can be determined with some precision. Also, the low metal content
of the globular cluster stars indicates that they belong to the first generation of
stars to condense out of the protogalaxy (called Population II; see Section 14.5)
and are therefore among the oldest objects in our galaxy. If all stars in a globular
cluster have the same initial chemical composition and age, and differ only in mass,
then these stars must fall on a locus in the Hertzsprung-Russell diagram, whose
shape depends only on the age and initial chemical composition. By comparing
computer solutions of the equations of stellar evolution with the densities of stars
in the observed Hertzsprung-Russell diagrams for a large number of globular
clusters, Iben$^{24}$ has deduced cluster ages ranging from $8 \times 10^{9}$ to $18 \times 10^{9}$
years, corresponding to initial helium abundances (by mass) ranging from 33 to
24%. It is not ruled out that all clusters have the same age, which would most
probably lie in the range from $9.5 \times 10^{9}$ to $15.5 \times 10^{9}$ years. If the age of the
universe is really greater than $9 \times 10^{9}$ years, and if the Hubble time $H_0^{-1}$ is
$13 \times 10^{9}$ years, then $q_0$ must be less than $\tfrac{1}{2}$, and the universe must be negatively
curved and infinite, as also indicated by the mass-density estimates discussed in
the last section.

It would certainly be premature to reach any definite conclusions about the
curvature of space from these estimates of the age of the universe. However, the
fact that the uranium and globular-cluster ages are roughly comparable with the
Hubble time $H_0^{-1}$ provides a strong argument that the observed correlation of
red-shift with luminosity-distance really does have something to do with the
evolution of the universe.

The explicit solution for $R(t)$ in the matter-dominated era can be used to
illustrate the horizons that limit our vision of the universe. The speed of light sets
an upper limit to the local propagation velocity of any signal, so at a given time $t$
an observer at $r = 0$ can receive signals emitted at time $t_1$ only from radial
coordinates $r < r_1$, where $r_1$ is the radial coordinate from which light signals emitted
at time $t_1$ would just reach $r = 0$ at time $t$. According to Eq. (14.3.1), $r_1$ is
determined by the formula

$$\int_0^{r_1} \frac{dr}{\sqrt{1 - kr^2}} = \int_{t_1}^{t} \frac{dt'}{R(t')} \tag{15.3.31}$$

If the $t'$-integral diverges as $t_1 \to 0$, then it is in principle possible to receive signals
emitted at sufficiently early times from any comoving particle (such as a “typical
galaxy”) in the universe. On the other hand, if the $t'$-integral converges for $t_1 \to 0$
(or, in singularity-free models, for $t_1 \to -\infty$), then our vision is limited by what
Rindler$^{25}$ has called a particle horizon: It is possible to receive signals at time $t$
only from comoving particles that lie within the radial coordinate $r_H(t)$, where

$$\int_0^{r_H(t)} \frac{dr}{\sqrt{1 - kr^2}} = \int_0^{t} \frac{dt'}{R(t')}$$

The proper distance (14.2.21) of this horizon is

$$d_H(t) = R(t) \int_0^{r_H(t)} \frac{dr}{\sqrt{1 - kr^2}} = R(t) \int_0^{t} \frac{dt'}{R(t')} \tag{15.3.32}$$



It is easy to see from Eq. (15.1.20) that a particle horizon will be present if $\rho$
grows faster than $R^{-2-\epsilon}$ as $R \to 0$, as would generally be expected. In particular,
if the greatest part of the $t'$-integral comes from the matter-dominated era, then
(15.3.4) can be used to express $dt'$ in terms of $R(t')$ and $dR(t')$, and we find

$$d_H(t) = \begin{cases} \dfrac{R(t)}{R_0 H_0 \sqrt{2q_0 - 1}} \cos^{-1}\left[1 - \dfrac{(2q_0 - 1) R(t)}{q_0 R_0}\right] & q_0 > \tfrac{1}{2} \quad (k = +1) \\[2ex] \dfrac{2}{H_0} \left(\dfrac{R(t)}{R_0}\right)^{3/2} & q_0 = \tfrac{1}{2} \quad (k = 0) \\[2ex] \dfrac{R(t)}{R_0 H_0 \sqrt{1 - 2q_0}} \cosh^{-1}\left[1 + \dfrac{(1 - 2q_0) R(t)}{q_0 R_0}\right] & q_0 < \tfrac{1}{2} \quad (k = -1) \end{cases} \tag{15.3.33}$$

In the early part of the matter-dominated era, $R$ was much less than $R_0$, so the
particle horizon was at a small proper distance:

$$d_H(t) \to H_0^{-1} \left(\frac{q_0}{2}\right)^{-1/2} \left(\frac{R}{R_0}\right)^{3/2} \simeq 3t \tag{15.3.34}$$


For $q_0 \le \tfrac{1}{2}$, $R(t)$ increases without limit as $t \to \infty$, so $d_H(t)$ increases faster than
$R(t)$, and the particle horizon will thus eventually expand to include any given
comoving particle. For $q_0 > \tfrac{1}{2}$, the universe is spatially finite, with a circumference
given by Eq. (14.2.4):

$$L(t) = 2\pi R(t) \tag{15.3.35}$$

Looking out in any given direction, we can see comoving particles out to a fraction
of this circumference, given by Eqs. (15.3.33) and (15.2.5) as

$$\frac{d_H(t)}{L(t)} = \frac{1}{2\pi} \cos^{-1}\left[1 - \frac{(2q_0 - 1) R(t)}{q_0 R_0}\right] \tag{15.3.36}$$

When $R(t)$ expands to its maximum value (15.3.9), this fraction will be $\tfrac{1}{2}$, and we
shall be able to see all the way to the “antipodes.” However, this fraction remains
less than unity until $R(t)$ shrinks once again to zero, so we shall not be able to see
all the way around the universe until then. If $q_0 = 1$ and $H_0^{-1} = 13 \times 10^{9}$
years, then the present circumference (15.3.35) is given by Eq. (15.2.5) as
$82 \times 10^{9}$ light years and the particle horizon is at one-quarter this distance, or
$20 \times 10^{9}$ light years.
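
The numbers in this example can be reproduced from the closed forms for the particle horizon. The Python sketch below is an illustration under stated assumptions: distances are in light years with $c = 1$, the Hubble time defaults to the $13 \times 10^{9}$ years used in the text, `x` stands for $R(t)/R_0$, and the function names are introduced here. For $q_0 > \tfrac{1}{2}$ the formula is applied only during the expanding phase, where the development angle lies between 0 and $\pi$.

```python
import math


def particle_horizon(q0, x=1.0, hubble_time=13e9):
    """Proper distance of the particle horizon for a matter-dominated model."""
    if q0 <= 0.0:
        raise ValueError("q0 must be positive")
    if abs(q0 - 0.5) < 1e-12:
        return 2.0 * hubble_time * x**1.5              # k = 0
    if q0 > 0.5:
        root = math.sqrt(2.0 * q0 - 1.0)               # k = +1
        return hubble_time * x / root * math.acos(1.0 - (2.0 * q0 - 1.0) * x / q0)
    root = math.sqrt(1.0 - 2.0 * q0)                   # k = -1
    return hubble_time * x / root * math.acosh(1.0 + (1.0 - 2.0 * q0) * x / q0)


def circumference(q0, x=1.0, hubble_time=13e9):
    """2 pi R(t) for the closed model, using R0 H0 sqrt(2 q0 - 1) = 1."""
    if q0 <= 0.5:
        raise ValueError("the circumference is finite only for q0 > 1/2")
    return 2.0 * math.pi * x * hubble_time / math.sqrt(2.0 * q0 - 1.0)


if __name__ == "__main__":
    print(f"circumference now:    {circumference(1.0):.2e} light years")
    print(f"particle horizon now: {particle_horizon(1.0):.2e} light years")
    print(f"fraction at maximum:  {particle_horizon(1.0, 2.0) / circumference(1.0, 2.0):.2f}")
```

The three branches join continuously at $q_0 = \tfrac{1}{2}$, where each gives $d_H = 2H_0^{-1}$ at the present epoch.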

Just as there are some comoving particles that we cannot now see, there may
in some cosmological models be events that we never shall see. An event that
occurs at $r_1$ at time $t_1$ will become visible at $r = 0$ at a time $t$ given by Eq. (15.3.31).
If the $t'$-integral diverges as $t \to \infty$ (or at the time of the next contraction to
$R = 0$), then it will in principle be possible to receive signals from any event if we
wait long enough. On the other hand, if the $t'$-integral converges for large $t$, then it
will only be possible to receive signals from events for which

$$\int_0^{r_1} \frac{dr}{\sqrt{1 - kr^2}} \le \int_{t_1}^{t_{\max}} \frac{dt'}{R(t')}$$

where $t_{\max}$ is either infinity or the time of the next contraction to $R = 0$. Rindler$^{25}$
calls this an event horizon. For $q_0 < \tfrac{1}{2}$ or $q_0 = \tfrac{1}{2}$, $R(t)$ grows as $t \to \infty$ like $t$ or
$t^{2/3}$, so the $t'$-integral diverges at $t = \infty$, and there is no event horizon. For
$q_0 > \tfrac{1}{2}$, the $t'$-integral converges at $t_{\max}$, so there is an event horizon: The only
events occurring at time $t_1$ that will be visible before the collapse of the universe
are those within a proper distance

$$d_E(t_1) = R(t_1) \int_{t_1}^{t_{\max}} \frac{dt'}{R(t')} = \frac{R(t_1)}{R_0 H_0 \sqrt{2q_0 - 1}} \left\{2\pi - \cos^{-1}\left[1 - \frac{(2q_0 - 1) R(t_1)}{q_0 R_0}\right]\right\} \tag{15.3.37}$$

If $q_0 = 1$ and $H_0^{-1} = 13 \times 10^{9}$ years, then the only events occurring now that
will ever become visible to us are those that occur within a proper distance of
$61 \times 10^{9}$ light years.
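
The event horizon distance for the closed model can be evaluated in the same way. In the sketch below (hypothetical function name; light years with $c = 1$; `x` stands for $R(t_1)/R_0$ during the expanding phase) the remaining development angle $2\pi - \theta_1$ replaces the elapsed angle $\theta_1$ that enters the particle horizon.

```python
import math


def event_horizon(q0, x=1.0, hubble_time=13e9):
    """Proper distance of the event horizon for q0 > 1/2 (matter-dominated, k = +1)."""
    if q0 <= 0.5:
        raise ValueError("no event horizon for q0 <= 1/2 in this model")
    root = math.sqrt(2.0 * q0 - 1.0)
    theta1 = math.acos(1.0 - (2.0 * q0 - 1.0) * x / q0)  # development angle at t1
    return hubble_time * x / root * (2.0 * math.pi - theta1)


if __name__ == "__main__":
    print(f"event horizon now for q0 = 1: {event_horizon(1.0):.2e} light years")
```

For $q_0 = 1$ the present development angle is $\pi/2$, so the result is $\tfrac{3}{2}\pi H_0^{-1}$.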

4 Intergalactic Emission and Absorption Processes 

Up to now we have dealt only with light signals, which are emitted by distant 
discrete sources, and propagate to us through essentially empty space. However, 
we saw in Section 15.2 that Einstein’s equations require a cosmic energy density
(if $H_0 \simeq 75$ km/sec/Mpc and $q_0 \simeq 1$) about equal to $2 \times 10^{-29}$ g/cm$^3$, which is
some 70 times larger than the observed density of galactic mass. If the missing mass
takes the form of an ionized or neutral gas filling intergalactic space, then we can 
hope to measure the mass density, and distinguish between cosmological models, 
by observing the absorption or time delay of light as it passes through the inter- 
galactic gas, or by observing the background radiation emitted by this gas. The 
absorption of light signals, and the emission and absorption of background radia- 
tion, become even more important when we turn our attention back to the early 
universe, when the density and opacity of matter was enormously greater than it is 
today. 

To lay a foundation for our treatment of these problems, let us first consider
the effects of absorption and emission on a ray of light, which leaves a source at
time $t_1$ with frequency $\nu_1$, and arrives at the earth at time $t_0$. If no emission occurs
in the intervening medium, then the loss of flux of the light ray is given by an
equation of form

$$\dot N(t) = -\Lambda\left(\nu_1 \frac{R(t_1)}{R(t)},\, t\right) N(t) \tag{15.4.1}$$

where $N$ is the photon number density in the light ray, and $\Lambda(\nu, t)$ is the absorption
rate (per unit proper time) of light with frequency $\nu$. [It is implicitly understood
here that at time $t$ the photons in the ray have red-shifted frequency $\nu_1 R(t_1)/R(t)$.]

The solution is usually written in the form

$$N(t_0) = e^{-\tau} N(t_1) \tag{15.4.2}$$

where $\tau$ is the optical depth:

$$\tau = \int_{t_1}^{t_0} \Lambda\left(\nu_1 \frac{R(t_1)}{R(t)},\, t\right) dt \tag{15.4.3}$$


Now suppose that the medium itself, apart from effects of the light ray, isotropically
emits $\Gamma(\nu, t)$ photons per unit proper volume, per unit proper time, and per unit
frequency interval at frequency $\nu$. These photons do not become part of the light
ray, but join the isotropic background radiation, to be discussed below. However,
Bose statistics require that photons will be added to the light ray through the
process of stimulated emission,$^{26}$ at a rate, per photon in the light ray, given
rigorously by

$$\Omega(\nu, t) = \frac{\Gamma(\nu, t)}{8\pi\nu^2} \tag{15.4.4}$$


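In place of Eq. (15.4.1), the rate of change of photon number density in the light
ray is now given by the formula

$$\dot N(t) = -\Lambda\left(\nu_1 \frac{R(t_1)}{R(t)},\, t\right) N(t) + \Omega\left(\nu_1 \frac{R(t_1)}{R(t)},\, t\right) N(t) \tag{15.4.5}$$

and so the optical depth in Eq. (15.4.2) must now be written as

$$\tau = \int_{t_1}^{t_0} \left[\Lambda\left(\nu_1 \frac{R(t_1)}{R(t)},\, t\right) - \Omega\left(\nu_1 \frac{R(t_1)}{R(t)},\, t\right)\right] dt \tag{15.4.6}$$
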

If the medium is in thermal equilibrium, not necessarily in equilibrium with
radiation, then $\Omega$ and $\Lambda$ are related by the Einstein formula$^{27}$:

$$\Omega(\nu, t) = \exp\left(-\frac{h\nu}{kT(t)}\right) \Lambda(\nu, t) \tag{15.4.7}$$

where $h$ is Planck’s constant, $k$ is Boltzmann’s constant, and $T(t)$ is the temperature
of the medium at time $t$. [This result simply follows from the principle of detailed
balance. The rate of spontaneous emission of photons per unit volume of phase
space in any given transition within the medium is equal to the rate of absorption
of photons in the inverse transition, times the ratio of the populations of the upper
and lower states, which is simply given by the Boltzmann factor $\exp(-h\nu/kT)$.
This factor depends only on $\nu$ and $T$, so the total rate of spontaneous emission per
unit volume of phase space, which according to (15.4.4) is just $\Omega$, is equal to the
total absorption rate $\Lambda$, times $\exp(-h\nu/kT)$.] The optical depth is then

$$\tau = \int_{t_1}^{t_0} \left[1 - \exp\left(-\frac{h\nu_1 R(t_1)}{kT(t) R(t)}\right)\right] \Lambda\left(\nu_1 \frac{R(t_1)}{R(t)},\, t\right) dt \tag{15.4.8}$$



Even if the medium is not strictly in thermal equilibrium, it is often a good
approximation to use (15.4.7) and (15.4.8), with $T(t)$ taken as an effective
temperature. Normally, $T$ is positive, so $e^{-\tau} < 1$, and a light ray is weakened as it passes
through the medium. However, it is sometimes possible to have a population
inversion in the medium, with a negative effective temperature. In this case $\tau$ is
negative, so $e^{-\tau} > 1$, and a light ray is amplified by the medium. Such maser
phenomena have been detected within our galaxy, but not yet in intergalactic
space.
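
The factor $1 - \exp(-h\nu/kT)$ that multiplies the absorption rate in the optical depth is worth evaluating for a few cases. The Python sketch below is illustrative: the function name, the rounded cgs constants, and the sample frequencies and temperatures (a 1420 MHz radio line at 100°K, and an optical-range frequency at 10°K) are assumptions chosen here, not values taken from the text.

```python
import math

H_PLANCK = 6.626e-27     # erg s
K_BOLTZMANN = 1.381e-16  # erg / K


def net_absorption_fraction(nu_hz, temperature_k):
    """1 - exp(-h nu / k T): net absorption after subtracting stimulated emission.

    A negative effective temperature (population inversion) gives a negative
    result, that is, net amplification.
    """
    x = H_PLANCK * nu_hz / (K_BOLTZMANN * temperature_k)
    return -math.expm1(-x)


if __name__ == "__main__":
    print(net_absorption_fraction(1.4204e9, 100.0))   # h nu << k T: small fraction
    print(net_absorption_fraction(1.0e15, 10.0))      # h nu >> k T: essentially 1
    print(net_absorption_fraction(1.4204e9, -100.0))  # inverted population: negative
```

When $h\nu \ll kT$ stimulated emission cancels almost all of the absorption, so the effective optical depth is reduced by roughly the factor $h\nu/kT$.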

Apart from the images of discrete sources, there is also an isotropic background
of radiation produced by the universe as a whole. Let $\mathcal{N}(\nu_0, t)\, d\nu_0$ be the number
density of photons at time $t$, which at time $t_0$ would have frequency between $\nu_0$
and $\nu_0 + d\nu_0$. If no absorption or emission occurs, then by the same reasoning
that led to Eq. (14.2.17), the time dependence of $\mathcal{N}(\nu_0, t)$ would simply be given
by a factor $R^{-3}(t)$, arising from the general expansion of the universe. In order to
calculate the rate of change of $\mathcal{N}(\nu_0, t) R^3(t)$ owing to spontaneous emission
processes, we note that photons, which at time $t_0$ are in the frequency range from $\nu_0$
to $\nu_0 + d\nu_0$, are in the frequency range from $\nu_0 R(t_0)/R(t)$ to $(\nu_0 + d\nu_0) R(t_0)/R(t)$ at
time $t$; hence the rate of change of the number of photons $\mathcal{N}(\nu_0, t) R^3(t)\, d\nu_0$ in a
proper volume $R^3(t)$ and a frequency interval $d\nu_0$ at $t_0$ is

$$\Gamma\left(\nu_0 \frac{R(t_0)}{R(t)},\, t\right) R^3(t) \left(\frac{R(t_0)}{R(t)}\, d\nu_0\right)$$

where $\Gamma$ is again the rate of spontaneous emission per unit proper volume and per
unit frequency interval, now including all discrete sources as well as the medium
itself. The rate of change of $\mathcal{N}(\nu_0, t) R^3(t)\, d\nu_0$ due to induced emission and
absorption is

$$\left(\Omega\left(\nu_0 \frac{R(t_0)}{R(t)},\, t\right) - \Lambda\left(\nu_0 \frac{R(t_0)}{R(t)},\, t\right)\right) \mathcal{N}(\nu_0, t) R^3(t)\, d\nu_0$$

just as in Eq. (15.4.5). Hence the effect of spontaneous and induced emission and
absorption is to give $\mathcal{N}(\nu_0, t) R^3(t)$ the rate of change

$$\frac{d}{dt}\left\{\mathcal{N}(\nu_0, t) R^3(t)\right\} = \Gamma\left(\nu_0 \frac{R(t_0)}{R(t)},\, t\right) R^2(t) R(t_0) + \left(\Omega\left(\nu_0 \frac{R(t_0)}{R(t)},\, t\right) - \Lambda\left(\nu_0 \frac{R(t_0)}{R(t)},\, t\right)\right) \mathcal{N}(\nu_0, t) R^3(t)$$

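Using (15.4.4) for $\Gamma$, the solution is

$$\begin{aligned} \mathcal{N}(\nu_0, t) R^3(t) = {} & \exp\left\{-\int_{t_1}^{t} \left[\Lambda\left(\nu_0 \frac{R(t_0)}{R(t')},\, t'\right) - \Omega\left(\nu_0 \frac{R(t_0)}{R(t')},\, t'\right)\right] dt'\right\} \mathcal{N}(\nu_0, t_1) R^3(t_1) \\ & + 8\pi\nu_0^2 R^3(t_0) \int_{t_1}^{t} \exp\left\{-\int_{t'}^{t} \left[\Lambda\left(\nu_0 \frac{R(t_0)}{R(t'')},\, t''\right) - \Omega\left(\nu_0 \frac{R(t_0)}{R(t'')},\, t''\right)\right] dt''\right\} \Omega\left(\nu_0 \frac{R(t_0)}{R(t')},\, t'\right) dt' \end{aligned}$$

with $t_1$ arbitrary. The first term just gives the number of photons left over from
before $t_1$, and the second gives the number of photons emitted since $t_1$: in both
cases the exponential factors represent the subsequent effects of absorption and
induced emission. This result simplifies if we take $t$ as the present instant $t_0$, and
take $t_1$ sufficiently far in the past so that essentially all the background radiation
was emitted since then. The present number density of photons per unit frequency
interval is then

$$\mathcal{N}(\nu_0, t_0) = 8\pi\nu_0^2 \int_{t_1}^{t_0} \exp\left\{-\int_{t}^{t_0} \left[\Lambda\left(\nu_0 \frac{R(t_0)}{R(t')},\, t'\right) - \Omega\left(\nu_0 \frac{R(t_0)}{R(t')},\, t'\right)\right] dt'\right\} \Omega\left(\nu_0 \frac{R(t_0)}{R(t)},\, t\right) dt \tag{15.4.9}$$
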


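If the medium is in thermal equilibrium, then (15.4.7) can be used to express $\Omega$ in
terms of $\Lambda$, and the present photon number density becomes

$$\begin{aligned} \mathcal{N}(\nu_0, t_0) = 8\pi\nu_0^2 \int_{t_1}^{t_0} & \exp\left(-\frac{h\nu_0 R(t_0)}{kT(t) R(t)}\right) \Lambda\left(\nu_0 \frac{R(t_0)}{R(t)},\, t\right) \\ & \times \exp\left\{-\int_{t}^{t_0} \left[1 - \exp\left(-\frac{h\nu_0 R(t_0)}{kT(t') R(t')}\right)\right] \Lambda\left(\nu_0 \frac{R(t_0)}{R(t')},\, t'\right) dt'\right\} dt \end{aligned} \tag{15.4.10}$$
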


We have not yet considered the effects of photon scattering. In calculating the
optical depth of a discrete source, any sort of scattering will remove photons from
the light rays, with no stimulated emission to return photons to the ray. Instead
of (15.4.8), the optical depth will then be

$$\tau = \int_{t_1}^{t_0} \left[1 - \exp\left(-\frac{h\nu_1 R(t_1)}{kT(t) R(t)}\right)\right] \Lambda\left(\nu_1 \frac{R(t_1)}{R(t)},\, t\right) dt + \int_{t_1}^{t_0} \Sigma\left(\nu_1 \frac{R(t_1)}{R(t)},\, t\right) dt \tag{15.4.11}$$

where $\Sigma(\nu, t)$ is the scattering rate for a photon of frequency $\nu$ at time $t$. It is much
more difficult to take into account the effects of photon scattering on the isotropic
background, because each scattering contributes a photon to the background for
every one it takes away. The one case that is easily dealt with is Thomson scattering,
in which both $h\nu$ and $kT$ are much less than the charged particle mass. In such
scatterings, there is no change in the photon frequency, so the scattering simply
has no effect on the isotropic background. In our calculations of the isotropic
background, we shall have to assume that scattering, except for Thomson scattering,
is much rarer than absorption. (However, note that resonant scattering must
be counted as absorption, if the mean lifetime of the resonant state is long compared
with the mean free time of the particles in the medium.)

Let us now apply this formalism to the problem of detecting the “missing
mass” in intergalactic space. If the intergalactic medium consists of a rarefied gas
of neutral atoms, such as hydrogen, then it will absorb radiation strongly at
various discrete frequencies, corresponding to various transitions between atomic
states. For simplicity, let us suppose that all absorption takes place within a small
frequency interval centered on a single absorption frequency $\nu_a$. The absorption
rate is then of the form

$$\Lambda(\nu, t) = n(t)\, \sigma_a(\nu)$$

where $n(t)$ is the number density of atoms at cosmic time $t$, and $\sigma_a(\nu)$ is the
absorption cross-section, assumed to be negligible except within a sharp peak at $\nu_a$.
According to (15.4.8), essentially all absorption of a ray of light, which leaves a
source at time $t_1$ with frequency $\nu_1 > \nu_a$ and arrives here at time $t_0$ with frequency
$\nu_0 < \nu_a$, will take place at a time $t_a$ such that

$$\nu_a = \frac{\nu_1 R(t_1)}{R(t_a)} = \frac{\nu_0 R(t_0)}{R(t_a)} \tag{15.4.12}$$

so that the optical depth is

$$\tau \simeq n(t_a) \left[1 - \exp\left(-\frac{h\nu_a}{kT(t_a)}\right)\right] \int \sigma_a\left(\nu_1 \frac{R(t_1)}{R(t)}\right) dt$$

By changing variables from $t$ to $\nu = \nu_1 R(t_1)/R(t)$, we find

$$\tau \simeq \frac{R(t_a)}{\dot R(t_a)}\, n(t_a) \left[1 - \exp\left(-\frac{h\nu_a}{kT(t_a)}\right)\right] I_a \tag{15.4.13}$$

where

$$I_a \equiv \int \sigma_a(\nu)\, \frac{d\nu}{\nu} \tag{15.4.14}$$



the integral being taken over a range of frequencies just large enough to include
the whole absorption line. The choice of a particular cosmological model is
necessary here only in order to determine the Hubble “constant” $\dot R/R$ at time $t_a$;
according to Eqs. (15.3.3), (15.4.12), and (15.3.22), this is

$$\frac{\dot R(t_a)}{R(t_a)} = \frac{R(t_0)}{R(t_a)}\, H_0 \left[1 - 2q_0 + 2q_0 \frac{R(t_0)}{R(t_a)}\right]^{1/2} = \left(\frac{\nu_a}{\nu_0}\right) H_0 \left[1 - 2q_0 + 2q_0 \frac{\nu_a}{\nu_0}\right]^{1/2} \tag{15.4.15}$$



Using (15.4.15) in (15.4.13), we see that the optical depth at a received frequency
$\nu_0$ is

$$\tau(\nu_0) = \frac{\nu_0\,n(t_a)\,I_a}{\nu_a H_0}\left[1 - \exp\left(-\frac{h\nu_a}{kT(t_a)}\right)\right]\left[1 - 2q_0 + 2q_0\frac{\nu_a}{\nu_0}\right]^{-1/2} \qquad (15.4.16)$$

Of course, this result applies only when (15.4.12) can be satisfied along the light
path, that is, for received frequencies in the range

$$\frac{\nu_a}{1+z} < \nu_0 < \nu_a \qquad (15.4.17)$$

where $z = \nu_1/\nu_0 - 1$ is the red shift of the source. Therefore we expect an absorption
trough in the received signal for $\nu_0$ within this range. The optical depth, caused by
a single absorption line at frequency $\nu_a$, vanishes for $\nu_0 < \nu_a/(1+z)$, jumps up
steeply at $\nu_a/(1+z)$ to a value

$$\tau\!\left(\frac{\nu_a}{1+z}+\right) = \frac{n(t_1)\,I_a}{H_0(1+z)}\left[1 - \exp\left(-\frac{h\nu_a}{kT(t_1)}\right)\right]\left[1 + 2q_0 z\right]^{-1/2} \qquad (15.4.18)$$

then varies more or less smoothly until just below $\nu_a$, where it takes the value

$$\tau(\nu_a-) = H_0^{-1}\,n(t_0)\,I_a\left[1 - \exp\left(-\frac{h\nu_a}{kT(t_0)}\right)\right] \qquad (15.4.19)$$

and finally drops down sharply to zero for $\nu_0 > \nu_a$.
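The shape of the absorption trough can be tabulated numerically. The following is a minimal sketch, not part of the original text: it evaluates (15.4.16) in c.g.s. units (hence the explicit factor of c), assumes that the absorbing atoms are conserved so that n(t_a) = n(t_0)(ν_a/ν_0)³, and treats the stimulated-emission factor as a constant supplied by the caller. The function names and default values are illustrative assumptions.

```python
import math

C = 2.998e10            # speed of light, cm/s
KM_PER_MPC = 3.0857e19  # kilometres in one megaparsec


def hubble_per_sec(h0_km_s_mpc):
    """Convert a Hubble constant from km/sec/Mpc to 1/sec."""
    return h0_km_s_mpc / KM_PER_MPC


def trough_optical_depth(nu0, nu_a, z_source, n0, I_a,
                         q0=1.0, h0=75.0, stim_factor=1.0):
    """Optical depth (15.4.16) at received frequency nu0 (c.g.s. units).

    Assumptions (not from the text at this point):
      * n(t_a) = n0 * (nu_a / nu0)**3, i.e. atoms are conserved;
      * [1 - exp(-h nu_a / kT(t_a))] is replaced by the constant stim_factor.
    """
    # Condition (15.4.17): outside this range no t_a exists on the light path.
    if not (nu_a / (1.0 + z_source) < nu0 < nu_a):
        return 0.0
    ratio = nu_a / nu0                      # equals R(t0)/R(ta) by (15.4.12)
    n_ta = n0 * ratio ** 3
    bracket = 1.0 - 2.0 * q0 + 2.0 * q0 * ratio
    return (C * n_ta * I_a * stim_factor
            / (ratio * hubble_per_sec(h0) * math.sqrt(bracket)))


if __name__ == "__main__":
    nu_a = 2.47e15  # Hz, roughly the Lyman-alpha frequency
    for frac in (0.34, 0.5, 0.75, 0.999):
        tau = trough_optical_depth(frac * nu_a, nu_a, 2.0, 1.2e-5, 4.5e-18)
        print(f"nu0/nu_a = {frac:5.3f}  tau = {tau:.3e}")
```

The two edges of the trough reproduce the limiting forms (15.4.18) and (15.4.19), which is a convenient consistency check on the reconstruction.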

At the same time that the intergalactic medium is absorbing light signals, it
will also be emitting isotropic background radiation. If the medium has an absorption
line at frequency $\nu_a$, then it will have a single emission line at the same
frequency, and the emitted radiation will be observed at red-shifted frequencies
$\nu_0 < \nu_a$. Using (15.4.10), and following the same reasoning that led to (15.4.16),
we find that the present number density of background photons per unit frequency
interval at a received frequency $\nu_0$ is

$$\mathcal{N}(\nu_0, t_0) = \frac{8\pi\nu_0^3\,n(t_a)\,I_a}{\nu_a H_0}\exp\left(-\frac{h\nu_a}{kT(t_a)}\right)\left[1 - 2q_0 + 2q_0\frac{\nu_a}{\nu_0}\right]^{-1/2} \qquad (15.4.20)$$

with $I_a$ and $t_a$ given by (15.4.14) and (15.4.12), respectively. The background
density $\mathcal{N}$ varies with $\nu_0$ more or less smoothly up to just below the frequency $\nu_a$,
where

$$\mathcal{N}(\nu_a-, t_0) = 8\pi\nu_a^2\,H_0^{-1}\,n(t_0)\,I_a\exp\left(-\frac{h\nu_a}{kT(t_0)}\right) \qquad (15.4.21)$$

and then drops down sharply to zero for $\nu_0 > \nu_a$.

The detailed $\nu_0$-dependence of the optical depth (15.4.16) and the background
density (15.4.20) depends on the history of the number density $n(t)$ and the
temperature $T(t)$. If the atoms that absorb and emit the line at $\nu_a$ are neither
created nor destroyed during the interval from $t_a$ to $t_0$, then

$$n(t_a) = n(t_0)\left[\frac{R(t_0)}{R(t_a)}\right]^3 = n(t_0)\left(\frac{\nu_a}{\nu_0}\right)^3 \qquad (15.4.22)$$

as in Eq. (14.2.17). In particular, for $\nu_a = \nu_0(1+z)$, we have $t_a = t_1$, so

$$n(t_1) = n(t_0)\,[1+z]^3 \qquad (15.4.23)$$

If the “missing mass” consists of intergalactic neutral hydrogen atoms, then for
$q_0 \simeq 1$ and $H_0 \simeq 75$ km/sec/Mpc, these atoms must have a mass density
$\rho_0 \simeq 2 \times 10^{-29}$ g/cm$^3$ (see Section 15.2) and hence a number density

$$n(t_0) = \frac{\rho_0}{m_H} \simeq 1.2 \times 10^{-5}\ \mathrm{cm}^{-3} \qquad (15.4.24)$$

Whether or not we believe this particular estimate, a number density of order
$10^{-5}$ cm$^{-3}$ can serve as a specific target at which to aim in attempts to detect the
intergalactic medium.
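The target density can be checked with a few lines of arithmetic. This sketch is an illustration rather than part of the original text: it assumes the zero-pressure relation ρ_0 = 3 q_0 H_0² / (4πG) for the present mass density, and the physical constants are rounded modern values chosen here.

```python
import math

G = 6.674e-8            # gravitational constant, cm^3 g^-1 s^-2
M_H = 1.6735e-24        # mass of a hydrogen atom, g
KM_PER_MPC = 3.0857e19  # kilometres in one megaparsec


def mass_density(q0, h0_km_s_mpc):
    """Present mass density in g/cm^3, assuming rho_0 = 3 q0 H0^2 / (4 pi G)."""
    h0 = h0_km_s_mpc / KM_PER_MPC   # Hubble constant in 1/sec
    return 3.0 * q0 * h0 ** 2 / (4.0 * math.pi * G)


def hydrogen_number_density(q0, h0_km_s_mpc):
    """Number density in cm^-3 if all of rho_0 is atomic hydrogen."""
    return mass_density(q0, h0_km_s_mpc) / M_H


if __name__ == "__main__":
    print(f"rho_0 = {mass_density(1.0, 75.0):.2e} g/cm^3")
    print(f"n_H   = {hydrogen_number_density(1.0, 75.0):.2e} cm^-3")
```

Both numbers come out close to the rounded values quoted in the text, and they scale as q_0 H_0², so halving H_0 lowers the target density by a factor of four.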

The most prominent radio-frequency absorption line in atomic hydrogen is the
21-cm hyperfine transition, produced by a flip in the proton and electron spins in
the 1s state, from total spin zero to total spin unity. The frequency of this line is
$\nu_a = 1420$ MHz, corresponding to a temperature $h\nu_a/k = 0.068$°K, which almost
certainly is much less than the “spin temperature” of whatever intergalactic
hydrogen may exist. Hence the correction factor owing to stimulated emission can
be approximated here as

$$1 - \exp\left(-\frac{h\nu_a}{kT}\right) \simeq \frac{h\nu_a}{kT} = \frac{0.068\,^\circ\mathrm{K}}{T} \qquad (15.4.25)$$

The absorption coefficient (15.4.14) has the value

$$I_{21\,\mathrm{cm}} = 2.73 \times 10^{-23}\ \mathrm{cm}^2 \qquad (15.4.26)$$


In 1959 an ingenious method for detecting weak absorption effects near 21 cm was
suggested by Field,$^{28}$ and used by him to search for such effects in the spectrum of
the radio galaxy Cygnus A. This source has red shift $z = 0.056$, and so according to
(15.4.17) an absorption trough should occur in the range of observed frequencies
from 1342 to 1420 MHz. This range is sufficiently narrow so that the optical
depth throughout the absorption trough should be well approximated by the value
(15.4.19). Together with (15.4.25) and (15.4.26), this gives a density-temperature
ratio (c.g.s. units)

$$\frac{n_H(t_0)}{T(t_0)} = \frac{k H_0 \tau}{h\nu_a I_a c} \simeq 4.4 \times 10^{-5}\,\tau\ \mathrm{cm}^{-3}\,\mathrm{deg}^{-1}\left[\frac{H_0}{75\ \mathrm{km/sec/Mpc}}\right] \qquad (15.4.27)$$

Field$^{28}$ could detect no absorption trough, and estimated that $\tau < 0.0075$, which
with $H_0 = 75$ km/sec/Mpc implies an upper limit $n_H(t_0)/T(t_0) < 3.3 \times 10^{-7}$
cm$^{-3}$ deg$^{-1}$ for the present density-temperature ratio of neutral hydrogen atoms.
This experiment has since been repeated by Field$^{29}$ and others,$^{30}$ but no absorption
trough has yet been definitely established in this frequency range. A recent
measurement by Penzias and Scott$^{30}$ gives $\tau < 5 \times 10^{-4}$, which with $H_0 =
75$ km/sec/Mpc implies

$$\frac{n_H(t_0)}{T(t_0)} < 2.3 \times 10^{-8}\ \mathrm{cm}^{-3}\,\mathrm{deg}^{-1} \qquad (15.4.28)$$

It seems reasonable to suppose$^{31}$ that the effective spin temperature of intergalactic
hydrogen should be about equal to 2.7°K, the temperature of the background
microwave radiation (see Section 15.5). In this case, (15.4.28) imposes
an upper limit on the intergalactic hydrogen number density

$$n_H(t_0) < 6 \times 10^{-8}\ \mathrm{cm}^{-3} \qquad (15.4.29)$$

which is 200 times smaller than the expected value (15.4.24). If intergalactic
hydrogen really makes up the missing mass, then (15.4.28) requires its temperature
to be over 500°K.
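The chain of numbers in this argument is easy to retrace. The sketch below is illustrative only: it evaluates the density-temperature ratio (15.4.27) from an optical-depth limit, then shows how the quoted limit (15.4.28) translates into either a density limit at an assumed spin temperature or a temperature floor at an assumed density. The constants are rounded modern values and the function names are hypothetical.

```python
H_PLANCK = 6.626e-27    # Planck's constant, erg s
K_BOLTZ = 1.3807e-16    # Boltzmann's constant, erg/deg
C = 2.998e10            # speed of light, cm/s
KM_PER_MPC = 3.0857e19  # kilometres in one megaparsec
NU_21CM = 1.4204e9      # frequency of the 21-cm line, Hz
I_21CM = 2.73e-23       # absorption coefficient (15.4.26), cm^2


def density_temperature_ratio(tau, h0_km_s_mpc=75.0):
    """n_H(t0)/T(t0) in cm^-3 deg^-1 from Eq. (15.4.27)."""
    h0 = h0_km_s_mpc / KM_PER_MPC
    return K_BOLTZ * h0 * tau / (H_PLANCK * NU_21CM * I_21CM * C)


def density_limit(ratio_limit, spin_temperature):
    """Upper limit on n_H for an assumed spin temperature (deg K)."""
    return ratio_limit * spin_temperature


def spin_temperature_floor(n_h, ratio_limit):
    """Minimum spin temperature that hides a density n_h from the search."""
    return n_h / ratio_limit


if __name__ == "__main__":
    print(f"Field limit   n/T < {density_temperature_ratio(0.0075):.2e}")
    print(f"n_H at 2.7 K      < {density_limit(2.3e-8, 2.7):.1e} cm^-3")
    print(f"T for 1.2e-5 cm^-3 > {spin_temperature_floor(1.2e-5, 2.3e-8):.0f} K")
```

Because the ratio scales linearly with both τ and H_0, any improved optical-depth limit tightens the density bound in direct proportion.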

Efforts have also been made to detect red-shifted 21-cm absorption effects in
the spectra of the quasi-stellar objects 3C191, PKS 1116 + 12, and 3C287. No such
effects were seen.$^{32}$

One way to set an upper limit on the density of neutral hydrogen in intergalactic
space, which would not depend on any assumed upper limit for the spin
temperature, is to search for the red-shifted 21-cm radiation that would be
emitted. There is in any case a microwave radiation background, so the additional
background caused by 21-cm emission would have to be detected by looking for a
step in the photon number density at $\nu_0 = \nu_a = 1420$ MHz. According to Eq.
(15.4.21), the number density per unit frequency interval $\mathcal{N}$ should be larger just
below than just above $\nu_a$ by the amount (in c.g.s. units)

$$\Delta\mathcal{N} = 8\pi\nu_a^2\,H_0^{-1}\,c^{-2}\,I_a\,n_H(t_0) \qquad (15.4.30)$$

[It is assumed here that $T \gg h\nu_a/k = 0.068$°K. If this is not the case, then the
absence of an absorption trough below 21 cm sets an even smaller upper limit on
$n_H(t_0)$ than (15.4.28) or (15.4.29).] Usually measurements of background radiation
are reported in terms of an equivalent “antenna temperature” $T_A$, defined by the
Rayleigh-Jeans relation

$$\mathcal{N} = 8\pi\nu_a\,k\,T_A\,h^{-1}\,c^{-3} \qquad (15.4.31)$$

so (15.4.30) gives

$$n_H(t_0) = \frac{k H_0\,\Delta T_A}{h\nu_a\,c\,I_a} \qquad (15.4.32)$$

where $\Delta T_A$ is the step in antenna temperature at 1420 MHz. Penzias and Wilson$^{33}$
report that $\Delta T_A < 0.08$°K, so that if $H_0 = 75$ km/sec/Mpc, then

$$n_H(t_0) < 3 \times 10^{-6}\ \mathrm{cm}^{-3} \qquad (15.4.33)$$

This upper limit is only four times smaller than the expected value (15.4.24), so it
is not yet entirely ruled out that the missing mass may consist of hot intergalactic
atomic hydrogen.
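The emission limit can be retraced in the same way. This is a sketch under an explicit assumption: the density is obtained by equating the photon-density step (15.4.30), taken with a factor c^-2 as dimensional consistency requires, to the Rayleigh-Jeans relation (15.4.31), which gives n_H = k H_0 ΔT_A / (h ν_a c I_a). The function name and constants are illustrative choices.

```python
H_PLANCK = 6.626e-27    # Planck's constant, erg s
K_BOLTZ = 1.3807e-16    # Boltzmann's constant, erg/deg
C = 2.998e10            # speed of light, cm/s
KM_PER_MPC = 3.0857e19  # kilometres in one megaparsec
NU_21CM = 1.4204e9      # frequency of the 21-cm line, Hz
I_21CM = 2.73e-23       # absorption coefficient (15.4.26), cm^2


def density_from_antenna_step(delta_t_a, h0_km_s_mpc=75.0):
    """n_H(t0) in cm^-3 implied by a step delta_t_a (deg K) at 1420 MHz.

    Assumes n_H = k H0 dT_A / (h nu_a c I_a), obtained by equating
    (15.4.30) with the Rayleigh-Jeans relation (15.4.31).
    """
    h0 = h0_km_s_mpc / KM_PER_MPC
    return K_BOLTZ * h0 * delta_t_a / (H_PLANCK * NU_21CM * C * I_21CM)


if __name__ == "__main__":
    limit = density_from_antenna_step(0.08)
    print(f"n_H(t0) < {limit:.1e} cm^-3")
    print(f"expected / limit = {1.2e-5 / limit:.1f}")
```

With the rounded constants used here the limit comes out slightly above 3 × 10⁻⁶ cm⁻³, consistent with the rounded figure in (15.4.33).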
One other prominent absorption line that has been used in the search for
intergalactic hydrogen is the Lyman α line of hydrogen, produced by an
electronic transition from the 1s to the 2p state. This line has a wavelength
λ = 1215 Å, which lies in the ultraviolet, so normally Lyman α would not penetrate
the earth’s atmosphere. However, a photon that has λ = 1215 Å when
$1.5 < z < 6$ will be shifted into the visible “window” between 3000 Å and
7000 Å when it reaches the earth, and can therefore be detected by ground-based
astronomers. Thus intergalactic hydrogen atoms might be detected by observing
absorption effects in the spectra of quasi-stellar objects with $z > 1.5$ at emitted
frequencies above Lyman α.

There are several reasons why Lyman α absorption provides a more sensitive
test for the presence of intergalactic hydrogen atoms than does 21-cm absorption.
First, the absorption coefficient (15.4.14) is much larger here:

$$I_{\mathrm{Ly}\,\alpha} = 4.5 \times 10^{-18}\ \mathrm{cm}^2 \qquad (15.4.34)$$

Also, the frequency $\nu_a$ is $2.47 \times 10^{15}$ Hz, corresponding to a temperature
$h\nu_a/k = 118{,}000$°K, and since we are now assuming that the ionization is small, we
necessarily have

$$\frac{h\nu_a}{kT} \gg 1 \qquad (15.4.35)$$

The factors $[1 - \exp(-h\nu_a/kT)]$ in Eqs. (15.4.16), (15.4.18), and (15.4.19),
which represent the suppression of absorption by stimulated emission, can then be
set equal to unity. Finally, quasi-stellar object spectra often show Lyman α as an
emission line, so if there is any appreciable neutral hydrogen nearby, then the blue
wing of this line ought to be conspicuously suppressed by a factor $e^{-\tau}$, with $\tau$
given by (15.4.18), (15.4.34), and (15.4.35), in c.g.s. units, as

$$\tau\!\left(\frac{\nu_a}{1+z}+\right) = \frac{n_H(t_1)\,c\,I_a}{H_0(1+z)(1+2q_0z)^{1/2}} = \frac{5.5 \times 10^{10}\,n_H(t_1)[\mathrm{cm}^{-3}]}{(1+z)(1+2q_0z)^{1/2}}\left[\frac{75\ \mathrm{km/sec/Mpc}}{H_0}\right] \qquad (15.4.36)$$

Note that the suppression of the blue wing of Lyman α measures the neutral
hydrogen density near the time of emission, not the present. (Also, if quasi-stellar
objects are local phenomena, then no suppression is to be expected.)


Attempts to detect Lyman α absorption effects have centered on the quasi-stellar
object 3C9, with $z = 2.012$. The first measurements were made in 1965 by
Gunn and Peterson;$^{34}$ they found a 40% depression in the blue wing of the Lyman
α emission line, which would give $\tau(\nu_a/(1+z)+) \simeq 0.5$. Taking $q_0 = \tfrac{1}{2}$ and
$H_0^{-1} = 10^{10}$ years, they concluded that the number density of neutral hydrogen
atoms at $z \simeq 2$ is about $6 \times 10^{-11}$ cm$^{-3}$. A subsequent photoelectric measurement by
Oke$^{35}$ showed no depression in the blue wing of the 3C9 Lyman α emission line,
and was interpreted by the Burbidges$^{40}$ as showing that $\tau < 0.05$. With $q_0 = 1$
and $H_0 = 75$ km/sec/Mpc, this gives

$$n_H(z \simeq 2) < 6 \times 10^{-12}\ \mathrm{cm}^{-3} \qquad (15.4.37)$$

If (15.4.23) is to be believed, then the “expected” value of $n_H$ at $z = 2$ is 27 times
larger than (15.4.24), so the observed upper limit (15.4.37) is 8 orders of magnitude
smaller than expected!
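Both Lyman α densities follow from inverting (15.4.36) for n_H(t_1). The sketch below is an illustration, not the authors' calculation: it uses the absorption coefficient (15.4.34), takes z = 2 as in the text's rounded estimate, and accepts the Hubble constant in sec⁻¹ so that either "H_0⁻¹ = 10¹⁰ years" or "75 km/sec/Mpc" can be supplied. Names and unit conversions are assumptions.

```python
import math

C = 2.998e10            # speed of light, cm/s
I_LYA = 4.5e-18         # Lyman-alpha absorption coefficient (15.4.34), cm^2
KM_PER_MPC = 3.0857e19  # kilometres in one megaparsec
SEC_PER_YEAR = 3.156e7  # seconds in one year


def neutral_density_at_source(tau, z, q0, hubble_per_sec):
    """Invert Eq. (15.4.36): n_H(t1) in cm^-3 for blue-wing optical depth tau."""
    return (tau * hubble_per_sec * (1.0 + z) * math.sqrt(1.0 + 2.0 * q0 * z)
            / (C * I_LYA))


if __name__ == "__main__":
    # Gunn and Peterson: tau ~ 0.5, q0 = 1/2, 1/H0 = 1e10 years.
    h0_gp = 1.0 / (1.0e10 * SEC_PER_YEAR)
    print(f"Gunn-Peterson: {neutral_density_at_source(0.5, 2.0, 0.5, h0_gp):.1e}")
    # Oke / Burbidges: tau < 0.05, q0 = 1, H0 = 75 km/sec/Mpc.
    h0_75 = 75.0 / KM_PER_MPC
    print(f"upper limit:   {neutral_density_at_source(0.05, 2.0, 1.0, h0_75):.1e}")
    # Conserved-atom expectation at z = 2, from (15.4.23) and (15.4.24).
    print(f"expected:      {1.2e-5 * 3.0 ** 3:.1e}")
```

Comparing the last two printed values makes the size of the discrepancy explicit.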

It is conceivable that the lack of neutral hydrogen near 3C9 is due to ionizing
radiation produced by 3C9 itself. For this reason, it is important to look for an
absorption trough extending from the Lyman α emission line toward shorter wavelengths,
which would be due to Lyman α absorption of light at great distances
from 3C9. [See Eqs. (15.4.16) and (15.4.17).] No such trough was found by Oke,$^{35}$
and the observations of Wampler$^{37}$ show only a slight depression, with an optical
depth $\tau$ no greater than about 0.3.

Other attempts have been made to detect intergalactic absorption at ultraviolet
wavelengths in the spectra of quasi-stellar objects, with no better success.
Field, Solomon, and Wampler$^{38}$ have looked for absorption effects due to intergalactic
molecular hydrogen in the spectrum of 3C9, and concluded that the intergalactic
mass density of molecular hydrogen is less than about $10^{-32}$ g/cm$^3$.
There is also a possibility that the intergalactic hydrogen is concentrated in clouds,
in which case the absorption of Lyman α should show up in quasi-stellar source
spectra as a set of more or less broad lines, one for each cloud along the line of sight.
No such effects have been found in analyses of quasi-stellar object spectra by
Bahcall and Salpeter,$^{39}$ Wagoner,$^{40}$ and Peebles,$^{41}$ and Peebles concludes that the
overall density of neutral hydrogen atoms, even if concentrated in clouds, must be
less than a few percent of the expected value (15.4.24). Recently three or four quasi-stellar
objects have been found with multiple absorption red shifts very much
smaller than the red shift of the corresponding emission line,$^{42}$ as if the absorption
occurred along the line of sight far from the source. However, this phenomenon is
rare, and could well be explained by processes occurring within the quasi-stellar
source itself.$^{43}$

If the missing mass is not to be found in the form of neutral hydrogen atoms
or molecules, then perhaps it consists of an ionized intergalactic hydrogen plasma,
possibly with small admixtures of heavier ions. The high degree of ionization
indicated by the absence of Lyman α absorption in quasi-stellar object spectra
could be explained in terms of a balance between collisional ionization and radiative
recombination, provided$^{43a}$ that the temperature at $z \simeq 2$ is above $10^6$ °K.

Such a hot gas would produce X-rays through the familiar bremsstrahlung
associated with thermal electron-ion collisions, at a rate per unit volume and per
unit frequency interval given (in c.g.s. units) by the formula

$$\Gamma(\nu) = \left[\frac{32\pi g\,e^6\,\overline{Z^3}\,n_i^2}{3h\nu c^3}\right]\left[\frac{2\pi}{3kTm_e^3}\right]^{1/2}\exp\left[-\frac{h\nu}{kT}\right]$$

where $n_i$ is the ion number density, $\overline{Z^3}$ is the mean cubed atomic number, and $g$ is a
“Gaunt” correction factor, estimated$^{43b}$ to lie between $\tfrac{1}{2}$ and 2 near the peak of the
photon spectrum. Field and Henry$^{43c}$ have calculated the resulting cosmic X-ray
background, under the assumption that the missing mass consists of H and He$^4$
(10% by number) which is suddenly heated to an initial temperature $T_0$ (between
$10^4$ °K and $10^{10}$ °K) at $R$ between $\tfrac{1}{2}R_0$ and $\tfrac{1}{10}R_0$, and then cools adiabatically,
with $T \propto R^{-2}$. The spectrum falls off rapidly for $h\nu > kT_0$, while the interstellar
medium within our galaxy is opaque to soft X-rays with $h\nu < 0.1$ keV, so an intergalactic
medium should produce an observable X-ray background only if its initial
temperature $T_0$ is above $10^6$ °K.

In fact, rocket observations (recently summarized by Brecher and Burbidge$^{44}$)
do reveal the existence of a diffuse X-ray and γ-ray background extending at least
from 250 eV to 100 MeV. This background is highly isotropic,$^{44a}$ suggesting that
it is at least in part of extragalactic origin. However, until recently the X-ray
background was not generally interpreted as providing evidence that the missing
mass consists of ionized intergalactic hydrogen. One reason is that estimates of the
X-ray intensity were lower than at present, while Field and Henry$^{43c}$ had assumed
a rather large value for the Hubble constant and hence a large value for the
missing mass density, so that it was difficult to construct any thermal history for
the intergalactic medium, with a temperature high enough to be consistent with
the Lyman α and 21 cm absorption and 21 cm emission results discussed above,
and yet low enough not to produce more soft X-rays than observed. Also, after
the discovery of the cosmic microwave background, it appeared that the X-ray
background might be explained as due to the inverse Compton scattering process
discussed at the end of the next section.

The origin of the cosmic X-ray background has been reconsidered very recently
by Cowsik and Kobetich.$^{44b}$ They find that the X-ray spectrum below 1 keV can
be accounted for by the inverse Compton effect, while above 100 keV the spectrum
is consistent with that expected from the production of γ-rays by white dwarfs.
However, between 1 keV and 100 keV there is an excess “kink” in the X-ray
spectrum, which can be roughly fit with the flux per unit energy interval:

$$I_{\mathrm{excess}}(E) \simeq 3\ \mathrm{keV\ cm^{-2}\ ster^{-1}\ sec^{-1}\ keV^{-1}} \times \exp\left(-\frac{E}{30\ \mathrm{keV}}\right)$$

This spectrum is just what we should expect for thermal bremsstrahlung from
intergalactic hydrogen, with an effective temperature of $3.3 \times 10^8$ °K and an
integrated squared ion density $\int n^2\,ds$ of order $10^{17}$ cm$^{-5}$. Such a medium could
furnish the missing mass, especially if $H_0$ has a rather low value, near 50 km/sec/Mpc.
However, Field$^{43a}$ points out that the excess X-ray background could also
be produced by “clumped” matter, such as ionized gas within clusters of galaxies$^{8a}$
(see Section 15.3), in which case the required mean mass density is reduced by the
ratio of the mean and r.m.s. densities, and falls below the critical density $\rho_c$.

These considerations have prompted a number of recent studies$^{45}$ of the
thermal history of the intergalactic medium. According to one interesting suggestion
of Rees,$^{46}$ the intergalactic medium is supposed to have become ionized at a
time corresponding to a critical red-shift $z_c$ between 2 and 3. In this case, the
absorption of light by neutral hydrogen at the Lyman α, β, ... lines and in the
Lyman continuum would reduce the luminosity of quasi-stellar objects with
$z > z_c$, particularly at the blue end of the spectrum, so that the notable lack of
quasi-stellar objects with $z > 3$ could be explained as a selection effect, quasi-stellar
objects being usually identified by their blue appearance in Palomar Sky
Survey plates. If the rapid increase of quasi-stellar object density with $z$ found by
Schmidt$^{17}$ (see Section 15.3) really continues beyond $z = 2$, then these sources
may well provide the energy that ionizes the intergalactic hydrogen at $z = z_c$.
Alternatively, it may be that the quasi-stellar objects are formed at $z = z_c$, and
that it is this formation process that ionizes the intergalactic medium. Either way,
it seems likely that something peculiar happened on a cosmic scale at $z \simeq 3$.

The effects of ionized intergalactic hydrogen on the propagation of light signals
can be readily calculated without detailed assumptions about the plasma temperature.
As long as $h\nu$ and $kT$ are much less than 1 MeV, the chief effect of the plasma
on light signals is an isotropic elastic scattering, with cross-section per electron
given by the Thomson value, $\sigma_T = 0.6652 \times 10^{-24}$ cm$^2$. The optical depth can
then be calculated from (15.4.11) neglecting the first term, and setting the
scattering rate equal to

$$\Sigma(\nu, t) = \sigma_T\,n_e(t) \qquad (15.4.38)$$

where $n_e$ is the number density of electrons, equal to the number density of protons.
Suppose that the whole missing mass consists of ionized hydrogen; then (15.1.22),
(15.2.6), and (15.2.3) give

$$n_e(t) = \frac{\rho(t)}{m_H} = \frac{3q_0H_0^2}{4\pi Gm_H}\left(\frac{R(t_0)}{R(t)}\right)^3 \qquad (15.4.39)$$

Also, (15.3.3) gives

$$dt = \frac{dR}{H_0R(t_0)}\left[1 - 2q_0 + 2q_0\frac{R(t_0)}{R}\right]^{-1/2} \qquad (15.4.40)$$


Using (15.4.38), (15.4.39), and (15.4.40) in (15.4.11), we find the optical depth to be

$$\tau = \frac{3q_0H_0\sigma_TR^2(t_0)}{4\pi Gm_H}\int_{R(t_1)}^{R(t_0)}\left[1 - 2q_0 + 2q_0\frac{R(t_0)}{R}\right]^{-1/2}\frac{dR}{R^3}$$

For a source with red shift $z = R(t_0)/R(t_1) - 1$, the optical depth is then$^{47}$

$$\tau(z) = \frac{H_0\sigma_T}{4\pi Gm_H\,q_0}\left[(3q_0 + q_0z - 1)(1 + 2q_0z)^{1/2} + 1 - 3q_0\right] \qquad (15.4.41)$$

where (c.g.s. units)

$$\frac{H_0\sigma_Tc}{4\pi Gm_H} = 0.035\left(\frac{H_0}{75\ \mathrm{km/sec/Mpc}}\right) \qquad (15.4.42)$$

The quasi-stellar objects with $z = 2$ do not seem particularly faint, so presumably
$\tau(2)$ is less than about unity; with $H_0 = 75$ km/sec/Mpc, this gives $q_0 < 10$.
With $q_0 = 1$ and $H_0 = 75$ km/sec/Mpc, the optical depth is less than unity out to
$z = 6$, so Thomson scattering probably does not play an important role in studies
of the quasi-stellar objects.
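The closed form for the Thomson optical depth can be checked against a direct numerical integration. This sketch is illustrative and rests on stated assumptions: the integral over R is rewritten in terms of the red shift z' = R(t_0)/R − 1, which turns the integrand into (1 + z')(1 + 2q_0z')^(-1/2); c.g.s. units are used throughout; and the function names and midpoint rule are choices made here.

```python
import math

C = 2.998e10            # speed of light, cm/s
G = 6.674e-8            # gravitational constant, cm^3 g^-1 s^-2
M_H = 1.6735e-24        # mass of a hydrogen atom, g
SIGMA_T = 0.6652e-24    # Thomson cross-section, cm^2
KM_PER_MPC = 3.0857e19  # kilometres in one megaparsec


def thomson_prefactor(h0_km_s_mpc=75.0):
    """The dimensionless combination H0 sigma_T c / (4 pi G m_H) of (15.4.42)."""
    h0 = h0_km_s_mpc / KM_PER_MPC
    return h0 * SIGMA_T * C / (4.0 * math.pi * G * M_H)


def thomson_depth(z, q0, h0_km_s_mpc=75.0):
    """Closed form (15.4.41); requires q0 > 0."""
    root = math.sqrt(1.0 + 2.0 * q0 * z)
    bracket = (3.0 * q0 + q0 * z - 1.0) * root + 1.0 - 3.0 * q0
    return thomson_prefactor(h0_km_s_mpc) * bracket / q0


def thomson_depth_numeric(z, q0, h0_km_s_mpc=75.0, steps=20000):
    """Midpoint-rule integral of 3 q0 (1+z') / sqrt(1 + 2 q0 z') from 0 to z."""
    dz = z / steps
    total = 0.0
    for i in range(steps):
        zp = (i + 0.5) * dz
        total += (1.0 + zp) / math.sqrt(1.0 + 2.0 * q0 * zp)
    return 3.0 * q0 * thomson_prefactor(h0_km_s_mpc) * total * dz


if __name__ == "__main__":
    for z, q0 in ((2.0, 1.0), (6.0, 1.0), (2.0, 10.0)):
        print(z, q0, thomson_depth(z, q0), thomson_depth_numeric(z, q0))
```

Agreement between the two columns confirms the algebra leading from the integral to (15.4.41), and the q_0 = 10 case shows where the depth for z = 2 approaches unity.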

An intergalactic medium of ionized hydrogen would not only scatter radio
signals; it would also delay$^{48}$ them. The group velocity of an electromagnetic wave
of frequency $\nu$ in an ionized gas with electron number density $n_e$ is given by$^{49}$

$$\beta = \left(1 - \frac{\nu_p^2}{\nu^2}\right)^{1/2} \qquad (15.4.43)$$

where $\nu_p$ is the plasma frequency

$$\nu_p = \left(\frac{e^2n_e}{\pi m_e}\right)^{1/2} = 8.97 \times 10^3\ \mathrm{Hz}\ (n_e[\mathrm{cm}^{-3}])^{1/2} \qquad (15.4.44)$$

(Again, this is valid only if $h\nu$ and $kT$ are both much less than the electron rest-energy.)
In a locally inertial coordinate system we have $|d\mathbf{x}| = \beta\,dt$, so the invariant
proper time is

$$d\tau^2 \equiv dt^2 - d\mathbf{x}^2 = (1 - \beta^2)\,dt^2$$

Equating this to the Robertson-Walker line element with $d\theta = d\phi = 0$, we have

$$(1 - \beta^2)\,dt^2 = dt^2 - \frac{R^2(t)\,dr^2}{1 - kr^2}$$

or, more simply,

$$\frac{\beta\,dt}{R} = \pm\frac{dr}{\sqrt{1 - kr^2}}$$

For a radio signal that leaves a source with comoving radial coordinate $r_1$ at time
$t_1$, the time of arrival is now delayed by a time $\Delta t$, given by

$$\int_{t_1}^{t_0+\Delta t}\beta\,\frac{dt}{R} = \int_0^{r_1}\frac{dr}{\sqrt{1 - kr^2}} \qquad (15.4.45)$$

where $t_0$ is the time the signal would have arrived in the absence of any dispersion,
that is,

$$\int_{t_1}^{t_0}\frac{dt}{R} = \int_0^{r_1}\frac{dr}{\sqrt{1 - kr^2}} \qquad (15.4.46)$$

In all cases of practical importance, $\nu_p$ is much less than $\nu$, so $\beta$ is very close to
unity,

$$\beta \simeq 1 - \frac{\nu_p^2(t)}{2\nu^2(t)} = 1 - \frac{\nu_p^2(t)\,R^2(t)}{2\nu_0^2\,R^2(t_0)} \qquad (15.4.47)$$

where $\nu_0$ is the frequency observed at time $t_0$. Subtracting Eq. (15.4.46) from
Eq. (15.4.45) and expanding to first order in $\Delta t$ and $1 - \beta$, we have then

$$\frac{\Delta t}{R(t_0)} = \int_{t_1}^{t_0}[1 - \beta]\,\frac{dt}{R} \qquad (15.4.48)$$

or, using (15.4.47),

$$\Delta t = \frac{1}{2\nu_0^2R(t_0)}\int_{t_1}^{t_0}\nu_p^2(t)\,R(t)\,dt \qquad (15.4.49)$$

This total time delay is not just the integral of $\nu_p^2/2\nu^2$ over time, as might be
thought. The extra factor of $R(t_0)/R(t)$ appears in Eq. (15.4.48) because the time
delay that has already occurred when a photon reaches any given point along its
path causes a slight additional increase in the distance it still has to go.

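It is convenient in evaluating $\Delta t$ to change variables from $t$ to

$$z' \equiv \frac{R(t_0)}{R(t)} - 1$$

Equation (15.3.3) then gives

$$dt = -H_0^{-1}\,[1 + 2q_0z']^{-1/2}\,(1 + z')^{-2}\,dz'$$

Also, if free electrons neither appear nor disappear, then

$$\nu_p^2(t) = \nu_{p0}^2\,(1 + z')^3$$

Now Eq. (15.4.49) gives

$$\Delta t = \frac{\nu_{p0}^2}{2\nu_0^2H_0}\int_0^z[1 + 2q_0z']^{-1/2}\,dz'$$

and therefore

$$\Delta t = \frac{\nu_{p0}^2}{2q_0\nu_0^2H_0}\left\{[1 + 2q_0z]^{1/2} - 1\right\} \qquad (15.4.50)$$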

For instance, suppose that $q_0 \simeq 1$ and $H_0 \simeq 75$ km/sec/Mpc. We then expect
a present electron number density $n_{e0} \simeq 1.2 \times 10^{-5}$ cm$^{-3}$ [see Eq. (15.4.24)],
in which case the present plasma frequency (15.4.44) is $\nu_{p0} \simeq 31$ Hz. In contrast,
the frequencies at which quasi-stellar sources are observed$^{49a}$ to fluctuate are of
order 10,000 MHz, which is greater than $\nu_{p0}$ by about seven orders of magnitude,
so the possible time delays are generally quite short. A sharp fluctuation in a
quasi-stellar source at $z \simeq 2$ will appear to us to occur later at $\nu_0 = 10{,}000$ MHz
than at very high frequencies by a time delay $\Delta t \simeq 2.5$ sec. Unfortunately,
although quasi-stellar sources do exhibit fluctuations, there do not seem to be any
fluctuations at radio frequencies that have time scales as short as a few days.$^{50}$
Also, even if such fluctuations did occur, the intergalactic time delay might be
obscured by dispersion within the source itself. However, if these difficulties could
be surmounted, then both $\nu_{p0}$ and $q_0$ could in principle be determined by measuring
time delays for various red shifts and comparing with Eq. (15.4.50).
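The numbers in this example can be reproduced directly from (15.4.44) and (15.4.50). The following is a minimal sketch with illustrative function names; it assumes that free electrons are conserved, as in the derivation above, and that q_0 > 0.

```python
import math

KM_PER_MPC = 3.0857e19  # kilometres in one megaparsec


def plasma_frequency_hz(n_e):
    """Plasma frequency (15.4.44) in Hz for n_e in cm^-3."""
    return 8.97e3 * math.sqrt(n_e)


def dispersion_delay(z, nu0, n_e0, q0=1.0, h0_km_s_mpc=75.0):
    """Time delay (15.4.50) in seconds at observed frequency nu0 (Hz).

    n_e0 is the present electron density in cm^-3; requires q0 > 0.
    """
    h0 = h0_km_s_mpc / KM_PER_MPC
    nu_p0 = plasma_frequency_hz(n_e0)
    return (nu_p0 ** 2 / (2.0 * q0 * nu0 ** 2 * h0)
            * (math.sqrt(1.0 + 2.0 * q0 * z) - 1.0))


if __name__ == "__main__":
    print(f"nu_p0 = {plasma_frequency_hz(1.2e-5):.1f} Hz")
    print(f"delay at 10,000 MHz, z = 2: {dispersion_delay(2.0, 1.0e10, 1.2e-5):.2f} sec")
    print(f"delay at  1,000 MHz, z = 2: {dispersion_delay(2.0, 1.0e9, 1.2e-5):.0f} sec")
```

The delay scales as ν_0⁻², so observing at a tenfold lower frequency lengthens it a hundredfold, which is why low radio frequencies are the natural place to look.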

A more modest and perhaps more practicable program is to measure the intergalactic
electron number density near our galaxy by observing the frequency-dependent
delay of radio signals from a pulsar in some relatively near galaxy.
(This is just an extension of the method actually used to determine the distance of
pulsars within our own galaxy, where the electron densities are reasonably well
known.) Pulsars are believed to be remnants of supernovae, so they might be found
in other galaxies by searching for very rapid radio or optical pulses at the sites of
recent supernovae, such as the one in the galaxy M101, at a distance $d \simeq 4$ Mpc.
(See Figure 15.2.) At such short distances, we should replace $z$ in Eq. (15.4.50) by
the small quantity $H_0d$. Also, newborn pulsars would probably emit about $10^4$
pulses/sec, so we are here really interested in the difference of the time delay at
neighboring frequencies $\nu_0$ and $\nu_0 + \delta\nu_0$:

$$-\delta(\Delta t) = \nu_{p0}^2\,\nu_0^{-3}\,\delta\nu_0\,d$$

For instance, if $\nu_{p0} = 31$ Hz, the difference in arrival times of pulses from a
pulsar in M101 at frequencies 1000 MHz and 1001 MHz would be $4 \times 10^{-4}$ sec,
comparable with the expected pulsar period. By working at 100 MHz rather than
1000 MHz, it would be possible to detect electron densities as low as about
$10^{-9}$ cm$^{-3}$. The problem will be to find a pulsar in some other galaxy.

Figure 15.2 Recent supernova in the galaxy NGC5457 (M101), photographed with
the 200-in. telescope at Mt. Palomar. Panels: June 9, 1950 and Feb. 7, 1951.
(Courtesy Mt. Wilson and Mt. Palomar Observatories.)
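The pulsar estimate follows from the small-distance limit of the delay formula. In this illustrative sketch the delay over a distance d is taken as Δt = ν_p0² d / (2 c ν_0²) in c.g.s. units (the factor of c is restored here), and the differential form is compared with the exact difference between two neighboring frequencies. Function names are hypothetical.

```python
C = 2.998e10           # speed of light, cm/s
CM_PER_MPC = 3.0857e24  # centimetres in one megaparsec


def nearby_delay(nu0, nu_p0, d_mpc):
    """Dispersion delay in seconds over a short distance d (Mpc).

    Small-z limit of Eq. (15.4.50) with z replaced by H0 d.
    """
    return nu_p0 ** 2 * d_mpc * CM_PER_MPC / (2.0 * C * nu0 ** 2)


def differential_delay(nu0, dnu, nu_p0, d_mpc):
    """First-order difference in delay between nu0 and nu0 + dnu (seconds)."""
    return nu_p0 ** 2 * dnu * d_mpc * CM_PER_MPC / (C * nu0 ** 3)


if __name__ == "__main__":
    linear = differential_delay(1.0e9, 1.0e6, 31.0, 4.0)
    exact = nearby_delay(1.0e9, 31.0, 4.0) - nearby_delay(1.001e9, 31.0, 4.0)
    print(f"first-order difference: {linear:.2e} sec")
    print(f"exact difference:       {exact:.2e} sec")
    # The signal grows as nu0**-3, so 100 MHz is a thousand times more sensitive.
    print(f"at 100 MHz:             {differential_delay(1.0e8, 1.0e6, 31.0, 4.0):.2e} sec")
```

Since the delay is proportional to ν_p0², and hence to the electron density, the thousandfold gain at 100 MHz translates directly into sensitivity to much lower densities.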

Other effects of an ionized intergalactic medium on light signals include
scintillation,$^{50a}$ free-free absorption,$^{50b}$ and perhaps Faraday rotation.$^{50c}$ Only
scintillation now seems promising as a probe for the missing mass.


5 The Cosmic Microwave Radiation Background 


The Einstein field equations require that the scale factor R(t) must have been 
extremely small at some finite time in the past (see Section 15.1). At this early 
epoch, matter and radiation were presumably in thermal equilibrium, with a very 
high temperature. As the universe subsequently expanded, both radiation and 
matter cooled. Eventually, when the temperature had dropped to about 4000°K, 
the free electrons joined atoms, so that the opacity dropped sharply, breaking the 
thermal contact between matter and radiation. Whatever radiation existed at that 
time has since been enormously red-shifted, but it still fills the space around us. 

It is widely, though not unanimously, believed that the microwave radiation
background discovered in 1965 is just this left-over radiation, red-shifted by a 
factor of approximately 1500 since the universe became transparent. If so, then 
the microwave background provides information of unparalleled value as to the 
history of the universe, not only back to the time when electrons became bound 
but, as we shall see, back much further, to the first few seconds of cosmic history. 

First, let us consider what sort of background radiation spectrum we would
expect on purely theoretical grounds. The proper energy density of the leftover
photons, with frequency at the present time $t_0$ between $\nu$ and $\nu + d\nu$, is given by
Eq. (15.4.10) as

$$\rho_{\gamma 0}(\nu)\,d\nu = h\nu \times 8\pi\nu^2\,d\nu\int_0^{t_0}\exp\left(-\frac{h\nu R_0}{kT(t)R(t)}\right) \times \Lambda\!\left(\frac{\nu R_0}{R(t)}, t\right)P(t_0, t; \nu)\,dt \qquad (15.5.1)$$

where $h$ is Planck’s constant; $k$ is Boltzmann’s constant; $R_0$ is an abbreviation for
$R(t_0)$; $T(t)$ is the temperature of the matter (as opposed to the radiation) at time $t$;
$\Lambda(\nu, t)$ is the absorption rate for a photon of frequency $\nu$ at time $t$; and $P(t_0, t; \nu)$
is the probability, taking account of stimulated emission, that a photon of frequency
$\nu R_0/R(t)$ present at time $t$ will survive until the present:

$$P(t_0, t; \nu) = \exp\left\{-\int_t^{t_0}\left[1 - \exp\left(-\frac{h\nu R_0}{kT(t')R(t')}\right)\right]\Lambda\!\left(\frac{\nu R_0}{R(t')}, t'\right)dt'\right\} \qquad (15.5.2)$$

The lower limit on the integral (15.5.1) can be chosen as any time $t_x$ for which
$P(t_0, t_x; \nu)$ is negligible; surely the choice $t_x = 0$ meets this requirement.

The formula (15.5.1) can usefully be rewritten in the form

$$\rho_{\gamma 0}(\nu)\,d\nu = 8\pi h\nu^3\,d\nu\int_0^{t_0}\left[\exp\left(\frac{h\nu R_0}{kT(t)R(t)}\right) - 1\right]^{-1} \times \frac{d}{dt}P(t_0, t; \nu)\,dt \qquad (15.5.3)$$

The survival probability $P$ rises from $P = 0$ at $t = 0$ to $P = 1$ at $t = t_0$, so this
is just a weighted average of Planck black-body distributions. If the opacity drops very
steeply at some time $t_R$, then $P$ is nearly a step function at $t = t_R$, and (15.5.3)
gives

$$\rho_{\gamma 0}(\nu)\,d\nu = \frac{8\pi h\nu^3\,d\nu}{[\exp(h\nu/kT_{\gamma 0}) - 1]} \qquad (15.5.4)$$

where

$$T_{\gamma 0} = \frac{T(t_R)\,R(t_R)}{R_0} \qquad (15.5.5)$$

Thus, under the assumption of a sharp drop in opacity, the present radiation background
should have a black-body spectrum, with temperature $T_{\gamma 0}$.

It is common to report measurements of the radiation background in terms of
a radiation flux $\phi_{\gamma 0}(\nu)$, the energy received per unit time, per unit receiving area,
per unit solid angle, and per unit frequency interval. The flux can be calculated in
c.g.s. units from the above formulas for $\rho_{\gamma 0}(\nu)$ by using

$$\phi_{\gamma 0}(\nu) = \frac{c\,\rho_{\gamma 0}(\nu)}{4\pi}$$

The background measurements are also frequently reported in terms of an equivalent
black-body temperature $T_{\gamma 0}(\nu)$, which is defined to be that temperature for which
black-body radiation would have the observed energy density or flux at frequency $\nu$.
That is,

$$\rho_{\gamma 0}(\nu)\,d\nu \equiv \frac{8\pi h\nu^3\,d\nu}{[\exp(h\nu/kT_{\gamma 0}(\nu)) - 1]} \qquad (15.5.6)$$

A black-body spectrum is then simply characterized as having $T_{\gamma 0}(\nu)$ independent
of $\nu$. Finally, it is occasionally convenient to report background measurements in
terms of an antenna temperature $T_A(\nu)$, which is defined to be that temperature
for which the low-frequency Rayleigh-Jeans approximation to (15.5.4) would give
the observed density or flux at frequency $\nu$:

$$\rho_{\gamma 0}(\nu)\,d\nu \equiv 8\pi kT_A(\nu)\,\nu^2\,d\nu \qquad (15.5.7)$$

Wherever possible, our discussion here will refer to the black-body temperature
$T_{\gamma 0}(\nu)$.
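The two temperature conventions are related by a simple closed form. Equating the right-hand sides of (15.5.6) and (15.5.7) gives T_A = (hν/k) / [exp(hν/kT_γ0) − 1], which can be inverted for the black-body temperature. The sketch below illustrates the conversion; the function names are assumptions, and the constants are rounded modern values.

```python
import math

H_PLANCK = 6.626e-27  # Planck's constant, erg s
K_BOLTZ = 1.3807e-16  # Boltzmann's constant, erg/deg


def antenna_temperature(t_blackbody, nu):
    """Antenna temperature (deg K) of black-body radiation at frequency nu (Hz).

    Follows from equating (15.5.6) with (15.5.7).
    """
    x = H_PLANCK * nu / (K_BOLTZ * t_blackbody)
    return (H_PLANCK * nu / K_BOLTZ) / math.expm1(x)


def blackbody_temperature(t_antenna, nu):
    """Equivalent black-body temperature (deg K) for a measured antenna temperature."""
    y = H_PLANCK * nu / (K_BOLTZ * t_antenna)
    return (H_PLANCK * nu / K_BOLTZ) / math.log1p(y)


if __name__ == "__main__":
    for nu in (1.0e9, 1.0e10, 1.0e11):
        t_a = antenna_temperature(2.7, nu)
        print(f"nu = {nu:.0e} Hz  T_A = {t_a:.4f} K  "
              f"back to T = {blackbody_temperature(t_a, nu):.4f} K")
```

At low frequencies the two temperatures differ only by about hν/2k, while near and above the peak of the spectrum the antenna temperature falls well below the black-body value, which is why the black-body temperature is the more convenient quantity to quote.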

If we make no assumptions about the thermal history of matter before the drop
in opacity, then all we can say is that the radiation background should have
roughly a black-body spectrum, with a temperature that tells us the value of
$R(t)/R_0$ at the time the universe became transparent. The theoretical situation is
tremendously improved if we can assume that, during the time that matter and
radiation were in thermal contact, the matter temperature relaxed according to
the formula

$$T(t) = \frac{A}{R(t)} \qquad (15.5.8)$$

with $A$ a constant. In this case, the first factor in the integrand of (15.5.3) can be
taken outside the integral, so we get the black-body formula (15.5.4), no matter how
gradual is the transition from an opaque to a transparent universe. Further, by taking
$t_0$ in (15.5.3) to be an arbitrary time $t$, we see now that $\rho_\gamma$ is given by a black-body
formula

$$\rho_\gamma(\nu, t)\,d\nu = \frac{8\pi h\nu^3\,d\nu}{[\exp(h\nu/kT_\gamma(t)) - 1]} \qquad (15.5.9)$$

with

$$T_\gamma(t) = \frac{A}{R(t)} \qquad (15.5.10)$$

at all times: after, during, and before the drop in opacity. It is of course not
surprising that the radiation would be described by the black-body formula
(15.5.9) during the time that matter and radiation were in equilibrium, and
naturally its temperature (15.5.10) equaled the matter temperature (15.5.8) during
that time. The noteworthy thing here is that the radiation goes on obeying the
black-body formula (15.5.9), with temperature given by (15.5.10), throughout the
period of transition from high to low opacity, and thereafter until the present.
The constant $A$ can be determined by setting $t = t_0$ in (15.5.8), so the radiation
temperature at all times is

$$T_\gamma(t) = T_{\gamma 0}\,\frac{R_0}{R(t)} \qquad (15.5.11)$$

and the matter temperature during equilibrium is the same:

$$T(t) = T_{\gamma 0}\,\frac{R_0}{R(t)} \qquad (15.5.12)$$

Thus the present radiation temperature $T_{\gamma 0}$ determines the thermal history of the
early universe during the whole period when $TR$ was constant.

In order to see when TR is likely to be constant, let us consider the model of 
an ideal gas in equilibrium with black-body radiation. The energy density of 
black-body radiation is given by integrating (15.5.9) over v: 

p y (t) = aT y 4 (t) 

where, in c.g.s. units, 

a = jk . = 7.5641 x 10 “ 15 erg cm ' 3 deg -4 
15/rc 3 

Thus the total pressure and energy density in this model are 
p = nhT + |aT 4 
p = nm + (7 — 1 )~ 1 nJcT + aT 4 

where n is the number density of gas particles, m is their mass, and 7 is the specific 
heat ratio of the gas, equal to 5/3 for a monatomic gas like atomic hydrogen. The 
equation of particle conservation can be written 

nR 3 = n 0 R 0 3 (15.5.13) 


whereas the equation (15.1.21) of energy conservation reads 


dR 


[ nmR 3 + (7 — 1) 1 nkTR 3 + aT 4 R 3 ] = 


-3 nkTR 2 - aT A R 2 


Using (15.5.13) and rearranging terms, this gives

$$\frac{R}{T}\frac{dT}{dR} = -\frac{\sigma + 1}{\sigma + \tfrac{1}{3}(\gamma - 1)^{-1}} \qquad (15.5.14)$$

where $\sigma k$ is the photon entropy per gas particle:

$$\sigma \equiv \frac{4aT^3}{3nk} = 74.0\,\frac{[T(\mathrm{deg})]^3}{n(\mathrm{cm^{-3}})} \qquad (15.5.15)$$

For $\sigma \ll 1$, Eq. (15.5.14) gives

$$T \propto R^{-3(\gamma - 1)} \qquad (15.5.16)$$

which is just the usual temperature-volume relation for the adiabatic expansion of
an ideal gas. On the other hand, for $\sigma \gg 1$, Eq. (15.5.14) gives

$$T \propto R^{-1} \qquad (15.5.17)$$

Even for large $\sigma$, as matter goes out of equilibrium with radiation, its temperature
curve ultimately shifts from (15.5.17) to (15.5.16). However, if $\sigma$ is extremely large,
then as long as there is any significant thermal contact between matter and
radiation, the radiation will continue to overpower the matter, and the matter
temperature will have the hoped-for behavior (15.5.8). In this case, (15.5.12),
(15.5.13), and (15.5.15) give $\sigma$ constant:

$$\sigma = \frac{4aT_{\gamma 0}^3}{3n_0k} \qquad (15.5.18)$$
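
As a quick numerical check of the two limits (15.5.16) and (15.5.17), the logarithmic slope in Eq. (15.5.14) can be evaluated directly. The following Python sketch is only an illustration; the function name `temperature_exponent` is hypothetical and does not come from the text.

```python
def temperature_exponent(sigma, gamma=5.0 / 3.0):
    """Return d(ln T)/d(ln R) as given by Eq. (15.5.14).

    sigma: specific photon entropy (dimensionless), Eq. (15.5.15)
    gamma: specific heat ratio of the gas (5/3 for a monatomic gas)
    """
    return -(sigma + 1.0) / (sigma + 1.0 / (3.0 * (gamma - 1.0)))


# Limit sigma << 1: the exponent tends to -3(gamma - 1), i.e. -2 for gamma = 5/3.
cold = temperature_exponent(1e-12)
# Limit sigma >> 1: the exponent tends to -1, so T R stays constant.
hot = temperature_exponent(1e12)
print(cold, hot)
```

For intermediate values of $\sigma$ the exponent interpolates smoothly between these two limits.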


Hence, if $\sigma$ is ever very large, then it stays very large. We then say that we are
dealing with a hot universe. In a hot universe, the background radiation approximately
satisfies (15.5.9) and (15.5.11) at all times, and the matter temperature obeys
(15.5.12) until the opacity becomes extremely small. Note that the number density
of photons in black-body radiation is the integral of $\rho_\gamma(\nu)/h\nu$ over $\nu$, or

$$n_\gamma = \frac{30\,\zeta(3)}{\pi^4}\,\frac{aT_\gamma^3}{k} = 0.37\,\frac{aT_\gamma^3}{k}$$

so

$$\sigma = 3.6\,\frac{n_{\gamma 0}}{n_0}$$

and the condition for a hot universe can be expressed as a requirement that there
are many photons for each proton or neutron in the present universe. None of these
considerations gives any clue as to the actual value of $T_{\gamma 0}$, or even as to whether
this is a hot universe.

The first theoretical estimate of the radiation temperature was based on a
theory of element synthesis worked out in the late 1940's by George Gamow and
his collaborators. 51 (This subject will be discussed in greater detail in Section 15.7.)
At the time when the temperature was $10^9$ °K, corresponding to the dissociation
temperature of deuterium, the number density of nucleons must have been roughly
$10^{18}$ cm$^{-3}$, in order that a fraction of order 10 to 50% of the neutrons and protons
could fuse into heavier elements. The specific photon entropy (15.5.15) at that time
was then $\sigma \approx 10^{11}$, so in this model the universe is indubitably hot, and $RT_\gamma$
would therefore have remained constant, both while the universe remained opaque,
and thereafter until the present. With a present baryon number density $10^{-6}$
cm$^{-3}$, the scale factor at present must be larger than when $T \approx 10^9$ °K by a factor
$(10^{18}/10^{-6})^{1/3}$, or $10^8$, so the present radiation temperature should be $10^{-8}$
times $10^9$ °K, or roughly 10°K. A somewhat more detailed analysis along these
lines, carried out in 1950 by Alpher and Herman, 52 gave $T_{\gamma 0} \approx 5$°K. Unfortunately,
Alpher and Herman went on to express doubts as to whether this
radiation would have survived until the present. It is of course true that the
individual photons extant at $T \approx 10^9$ °K would have been absorbed long before
now. However, because $\sigma \gg 1$, the matter temperature must relax like $R^{-1}$, so
that the photons emitted just as the universe is becoming transparent must have
had the same value of $TR$ as during the time of element synthesis. Nevertheless,
the remarkable prediction of a 5°K black-body radiation background was allowed
to slip into obscurity.
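
The order-of-magnitude arithmetic of this estimate can be retraced in a few lines. The Python sketch below is an illustration under the assumptions stated in the paragraph above (a temperature of $10^9$ °K at a nucleon density of $10^{18}$ cm$^{-3}$, and a present density of $10^{-6}$ cm$^{-3}$); the function names are hypothetical.

```python
def sigma_from_density(temperature_k, nucleon_density_cm3):
    """Specific photon entropy from Eq. (15.5.15): 74.0 * T^3 / n (T in deg K, n in cm^-3)."""
    return 74.0 * temperature_k**3 / nucleon_density_cm3


def present_temperature(t_then_k, n_then_cm3, n_now_cm3):
    """Scale the temperature to the present, assuming n R^3 = const and T R = const."""
    expansion_factor = (n_then_cm3 / n_now_cm3) ** (1.0 / 3.0)  # R_now / R_then
    return t_then_k / expansion_factor


sigma_then = sigma_from_density(1e9, 1e18)
t_now = present_temperature(1e9, 1e18, 1e-6)
print(f"sigma at element synthesis ~ {sigma_then:.1e}")
print(f"present radiation temperature ~ {t_now:.0f} deg K")
```

The sketch gives $\sigma$ of order $10^{11}$ and a present temperature of about 10°K, in line with the rough figures quoted above.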

The problem of determining $T_{\gamma 0}$ was taken up again in 1965 by Dicke, Peebles,
Roll, and Wilkinson. 53 They argued that the universe must once have been hotter
than $10^{10}$ °K, because it either has expanded from a singularity with $R = 0$, or,
if it undergoes a cyclic oscillation between finite values of $R$, it must get hot
enough to dissociate the heavy elements left over from the previous cycle. This
argument does not fix a value for the present radiation temperature, but Dicke et al.
reasoned that the energy density of cosmic black-body radiation should not be
large enough to give $q_0 > 1$ (see Section 15.2), so that $T_{\gamma 0} < 40$°K. The really
important feature of their work, however, was not this estimate, but rather the
fact that at last the black-body radiation background was being taken seriously,
with an experiment to measure $T_{\gamma 0}$ being prepared by Roll and Wilkinson.

The difficult part of measuring a radiation temperature less than 40°K is of 
course that the receiver circuits are at a much higher temperature, so that the 
signal must be hundreds of times weaker than the receiver noise. In order to pick 
out the signal, Roll and Wilkinson planned to use a radiometer invented by Dicke in 
1945. In this device the radio receiver is switched back and forth a hundred times a 
second between one horn pointing at the sky and another looking into a bath of 
liquid helium. The receiver output is filtered to separate just that part that varies 
with a frequency of 100 Hz, and the strength of this filtered output then measures 
the difference between the radiation received from the liquid helium and the sky. 
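
The switching scheme described above can be illustrated with a toy simulation. The Python sketch below is not a model of the actual Roll and Wilkinson apparatus: the temperatures, the per-sample noise level, the sampling rate, and the function name `dicke_difference` are all assumptions chosen for illustration. It shows only that multiplying the receiver output by the switching waveform and averaging removes the large receiver contribution and leaves the sky-minus-load difference.

```python
import random


def dicke_difference(t_sky, t_load, t_receiver, noise_rms,
                     n_samples=400_000, samples_per_half_cycle=50, seed=0):
    """Estimate t_sky - t_load by synchronous detection of a switched receiver.

    With 10,000 samples per second (an assumed rate), 50 samples per half
    cycle corresponds to switching between the two horns 100 times a second.
    """
    rng = random.Random(seed)
    accumulated = 0.0
    for i in range(n_samples):
        looking_at_sky = (i // samples_per_half_cycle) % 2 == 0
        reference = 1.0 if looking_at_sky else -1.0
        source = t_sky if looking_at_sky else t_load
        # Receiver output: source plus a large constant receiver term plus noise.
        sample = source + t_receiver + rng.gauss(0.0, noise_rms)
        # Multiplying by the reference keeps only the part varying at the switch rate.
        accumulated += reference * sample
    # The averaged product equals half the sky-minus-load difference.
    return 2.0 * accumulated / n_samples


estimate = dicke_difference(t_sky=3.0, t_load=4.2, t_receiver=300.0, noise_rms=30.0)
print(f"estimated T_sky - T_load = {estimate:.2f} deg K (true value -1.20)")
```

The constant receiver term cancels because it is multiplied by +1 and -1 for equal times; only the random noise survives, and it averages down as the square root of the number of samples.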

Before Roll and Wilkinson could complete a measurement of $T_{\gamma 0}$, they learned
that Penzias and Wilson 54 had observed a weak background signal at a radio
wavelength $\lambda = 7.35$ cm in the large horn antenna at Holmdel, New Jersey, built
to observe the Echo satellite. The antenna temperature could be fit to the curve

$$T_A(\theta) = 4.4\,^\circ\mathrm{K} + 2.3\,^\circ\mathrm{K}\,\sec\theta$$

where $\theta$ is the angle between the antenna axis and the zenith. The thickness of
atmosphere (taken as a flat slab) through which the antenna beam passes is
proportional to $\sec\theta$, so the second term could be ascribed to radiation from our
atmosphere. An additional 0.9°K was estimated as the contribution of ohmic
losses in the antenna and radiation from the earth into the antenna side lobes,
leaving for the cosmic microwave background a net antenna temperature
3.5°K ± 1°K. Since $kT_A \gg h\nu$, this is also the equivalent black-body temperature

$$T_{\gamma 0}(7.35\ \mathrm{cm}) = 3.5\,^\circ\mathrm{K} \pm 1\,^\circ\mathrm{K}$$

This observation, probably the most important to cosmology since Hubble's
discovery of the relation between red shift and distance, was published 54 in 1965
under the modest title “A Measurement of Excess Antenna Temperature at



4080 MHz” with the paper by Dicke, Peebles, Roll, and Wilkinson 53 appearing
as a companion article to explain the fundamental significance of this measurement.

Although Penzias and Wilson reported their result as an “excess antenna
temperature,” it is important to realize that they had only measured a radiation
flux at a single wavelength. It remained to verify the Planck form (15.5.4) of the
radiation frequency distribution. In Table 15.1 I have listed the measurements of
the equivalent black-body temperature of the background radiation that have been
carried out at various microwave and far-infrared wavelengths.

Table 15.1 Summary of Measurements of the Background Radiation Flux at
Microwave and Far-Infrared Wavelengths.

(The temperatures listed are those for which black-body radiation would give
the observed flux at the indicated wavelength.)

$\lambda$ (cm)    Method                        Reference   $T_\gamma(\lambda)$ (°K)
73.5          Ground-based radiometer       a           3.7 ± 1.2
49.2          Ground-based radiometer       a           3.7 ± 1.2
21.0          Ground-based radiometer       b           3.2 ± 1.0
20.7          Ground-based radiometer       c           2.8 ± 0.6
7.35          Ground-based radiometer       d           3.5 ± 1.0
3.2           Ground-based radiometer       e           3.0 ± 0.5
3.2           Ground-based radiometer       f           2.69 (+[illegible], -0.21)
1.58          Ground-based radiometer       f           2.78 (+0.12, -0.17)
1.50          Ground-based radiometer       g           2.0 ± 0.8
0.924         Ground-based radiometer       h           3.16 ± 0.26
0.856         Ground-based radiometer       i           2.56 (+0.17, -0.22)
0.82          Ground-based radiometer       j           2.9 ± 0.7
0.358         Ground-based radiometer       j'          2.4 ± 0.7
0.33          Ground-based radiometer       k           2.46 (+0.40, -0.44)
0.33          Ground-based radiometer       k'          2.61 ± 0.25
0.263         CN (J = 1/J = 0)              l           2.3
0.263         CN (J = 1/J = 0)              m           3.22 ± 0.15 (ζ Oph)
                                                        3.0 ± 0.6 (ζ Per)
0.263         CN (J = 1/J = 0)              n           3.75 ± 0.50
0.263         CN (J = 1/J = 0)              o           ≤ 2.82
0.132         CN (J = 2/J = 1)              n           < 7.0
0.132         CN (J = 2/J = 1)              o           < 4.74
0.0559        CH                            n           < 6.6
0.0559        CH                            o           < 5.43
0.0359        CH+                           o           < 8.11
0.04-0.13     Rocket-borne IR telescope     p           8.3 (+2.2, -1.8)
> 0.05        Balloon-borne IR radiometer   q           3.6, 5.5, 7.0
0.6-0.008     Rocket-borne IR radiometer    r           3.1 (+0.5, -2.0)
0.18-1.0      Balloon-borne IR radiometer   s           2.7 (+0.4, -0.2)
0.13-1.0      Balloon-borne IR radiometer   s           2.8 ± 0.2
[illegible]   Balloon-borne IR radiometer   s           ≤ 2.7
0.054-1.0     Balloon-borne IR radiometer   s           < 3.4

At wavelengths above 100 cm, the cosmic background is swamped by the
VHF radiation emitted by our galaxy. In the range from 75 to 0.3 cm, the background
radiation can be measured with a ground-based microwave radiometer,
like that employed by Penzias and Wilson and Roll and Wilkinson. However,
below $\lambda = 3$ cm the emission from our atmosphere becomes extremely troublesome,
and it is necessary to make observations at mountain altitudes, and at wavelengths,
such as 0.9 cm and 0.3 cm, where “windows” appear in the atmosphere. Below
$\lambda = 0.3$ cm there are no more useful windows, and the measuring equipment must
be carried on a balloon or a rocket. In addition, it is possible to infer a background
temperature at certain wavelengths from the absorption of light by molecules in
interstellar space. For instance, cyanogen has a visible absorption line at 3874 Å,
corresponding to transitions from the ground electronic configuration to an excited
electronic configuration. (See Figure 15.3.) Both electronic configurations are split
into rotational energy levels, distinguished by the rotational angular momentum
$J$, so this absorption line splits into a number of components, 55 of which the most
important are R(0) [$J = 0 \to J = 1$; $\lambda = 3874.608$ Å], R(1) [$J = 1 \to J = 2$;
$\lambda = 3873.998$ Å], P(1) [$J = 1 \to J = 0$; $\lambda = 3875.763$ Å], and R(2) [$J = 2 \to
J = 3$; $\lambda = 3873.369$ Å]. (These transitions are governed by a dipole selection
rule, $\Delta J = \pm 1$.) In 1941 McKellar 56 discovered that cyanogen radicals in an
interstellar cloud between us and the star ζ Ophiuchi were absorbing light from that
star, not only in the R(0) transition from the $J = 0$ ground state, but also in the
R(1) transition from the first excited rotational state, which is at an excitation
energy corresponding to a wavelength 2.64 mm. From the relative strength of the
two absorption lines, a population for the $J = 1$ state could be inferred, corresponding
to a temperature 2.3°K. McKellar could not be sure that there was not
some special excitation mechanism at work, so the only conclusion that could be
reached was that the radiation background at $\lambda = 2.64$ mm has an equivalent
black-body temperature less than about 2.3°K. After the discovery of 3.5°K
radiation at 7.35 cm by Penzias and Wilson, 54 Field, 57 Woolf, 58 and Shklovsky 58a

independently realized that McKellar's old observations of ζ Ophiuchi might
actually have measured the radiation background temperature, and not just set an
upper bound on it. This was confirmed by theoretical analyses 57,59 that rejected
all other rotational excitation mechanisms, and the measurements were repeated,
now including data 60 on the P(1) absorption line, and from a number of other stars.
No precise radiation temperature has emerged from these measurements, but it
appears pretty certain that $T_\gamma$ at 2.64 mm is between 2.7 and 3.7°K. There has also
been an unsuccessful search 60 for the R(2) absorption line in CN and various
absorption lines from excited rotational states in CH and CH$^+$, which allows
upper limits to be set on $T_\gamma$ at wavelengths 1.32 mm, 0.559 mm, and 0.359 mm.

[Figure 15.3 Transitions in the cyanogen absorption spectrum used to set limits on
the cosmic microwave radiation background.]
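
The step from a level population to a temperature can be sketched numerically. The Python example below assumes a Boltzmann distribution over rotational levels with statistical weights $2J + 1$, so that the ratio of the $J = 1$ to the $J = 0$ population is $3\exp(-hc/\lambda kT)$ with $\lambda = 0.264$ cm. This is a standard relation rather than anything quoted from the text, the function names are hypothetical, and the conversion of measured line strengths into a population ratio (which needs the transition strengths) is not attempted.

```python
import math

HC_OVER_K_CM_K = 1.4388  # second radiation constant hc/k in cm * deg K (assumed value)


def population_ratio(temperature_k, wavelength_cm=0.264):
    """Boltzmann ratio n(J=1)/n(J=0) for statistical weights 3 and 1."""
    return 3.0 * math.exp(-HC_OVER_K_CM_K / (wavelength_cm * temperature_k))


def excitation_temperature(ratio_n1_over_n0, wavelength_cm=0.264):
    """Invert the Boltzmann ratio to obtain the excitation temperature in deg K."""
    return HC_OVER_K_CM_K / (wavelength_cm * math.log(3.0 / ratio_n1_over_n0))


ratio_at_2p3 = population_ratio(2.3)
recovered = excitation_temperature(ratio_at_2p3)
print(f"n1/n0 at 2.3 deg K = {ratio_at_2p3:.3f}; recovered T = {recovered:.2f} deg K")
```

At 2.3°K roughly a quarter as many radicals sit in the $J = 1$ state as in the ground state, which is why absorption from the excited level is observable at all.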

Inspection of Table 15.1 shows that with the exception of the rocket and
balloon infrared measurements, all observations are consistent with a 2.7°K
black-body distribution. But before we conclude that a black-body distribution has
been definitely established, we have to ask how significant this agreement is,
and we have to worry about the high-altitude infrared measurements. All of the
data at wavelengths above 1 cm unfortunately lie in that part of a 2.7°K Planck
distribution that is very well approximated by the Rayleigh-Jeans law

$$\rho_{\gamma 0}(\nu)\,d\nu \simeq 8\pi kT_{\gamma 0}\,\nu^2\,d\nu \qquad (15.5.19)$$

which can be obtained by letting $\nu \to 0$ in Eq. (15.5.4). For instance, at $\lambda = 1.5$
cm, the flux for a 2.7°K black body is only 15% below what would be given by the
Rayleigh-Jeans formula (15.5.19), and even at $\lambda = 0.856$ cm, the Planck flux is
only 35% below the Rayleigh-Jeans flux. (See Figure 15.4.) This is a serious
deficiency, because one can imagine a number of models that would give a Rayleigh-Jeans
curve (15.5.19) down to wavelengths well below the point where the Planck
law begins to drop below the Rayleigh-Jeans law. For instance, suppose that the
observed microwave background was emitted at a time $t_R$ when the photon
absorption probability $1 - P$ dropped sharply, not from 1 to 0, but from some
value $\alpha < 1$ to 0. Then instead of a black-body law, Eq. (15.5.3) would give a
gray-body law,

$$\rho_{\gamma 0}(\nu)\,d\nu = \frac{8\pi\alpha h\nu^3\,d\nu}{[\exp(h\nu/kT_\alpha) - 1]} \qquad (15.5.20)$$

with $0 < \alpha < 1$ and $T_\alpha = T(t_R)R(t_R)/R(t_0)$. Then, to account for the data at
$\lambda > 1$ cm, it would be necessary to take

$$T_\alpha = \frac{2.7\,^\circ\mathrm{K}}{\alpha}$$

and the flux would then be given by the Rayleigh-Jeans law down to a wavelength
$\lambda \approx \alpha$ cm. The total radiant energy density must be less than $10^{-7}$ erg/cm$^3$ (see
Section 15.2), so $\alpha$ could in principle be as small as 0.08. In order to rule out this
sort of theory, we must use data for $\lambda < 1$ cm, and preferably for $\lambda < 0.2$ cm,
where a 2.7°K Planck distribution has its maximum. Unfortunately, these are
just the wavelengths where the atmosphere begins to interfere with radiometer
measurements. The whole case for a black-body distribution, rather than a gray-body
distribution, therefore rests on the radiometer measurements 61 at mountain
altitude, which give a flux at about 3 mm three times less than expected for the
Rayleigh-Jeans law (Eq. (15.5.19) with $T_{\gamma 0} = 2.7$°K), plus the absorption spectra
of interstellar molecules, which give upper limits 62 on the flux, at 2.63 mm,
1.32 mm, 0.559 mm, and 0.359 mm, less than Rayleigh-Jeans by factors 2.9, 2.2,
12, and 9.3, respectively. This evidence points strongly to a distribution that does
not keep going up like the Rayleigh-Jeans law, but bends over steeply around
0.2 cm, as expected for black-body radiation.

[Figure 15.4 Energy density per frequency interval for 2.7°K black-body radiation,
plotted from $3 \times 10^9$ Hz to $3 \times 10^{12}$ Hz.
The solid curve gives the Planck spectrum (15.5.4); the dashed line gives the
Rayleigh-Jeans spectrum (15.5.7) for an antenna temperature of 2.7°K. The short vertical
lines mark frequencies at which the black-body temperature has been measured or
bounded by radiometer or interstellar absorption observations.]
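
The factors quoted for the interstellar-absorption limits can be retraced from Table 15.1. The Python sketch below assumes that the relevant limits are the rows attributed to reference o in that table (2.82, 4.74, 5.43, and 8.11°K); that pairing is my reading of the table rather than something the text states. It computes the ratio of the 2.7°K Rayleigh-Jeans flux (15.5.19) to the Planck flux at the limiting temperature. The function name and the value used for $hc/k$ are illustrative choices.

```python
import math

HC_OVER_K_CM_K = 1.4388  # second radiation constant hc/k in cm * deg K (assumed value)


def rayleigh_jeans_over_planck(wavelength_cm, t_planck_k, t_rj_k=2.7):
    """Ratio of the Rayleigh-Jeans flux at t_rj_k to the Planck flux at t_planck_k.

    Rayleigh-Jeans: proportional to k * T_rj * nu^2.
    Planck:         proportional to h * nu^3 / (exp(h nu / k T) - 1).
    """
    h_nu_over_k = HC_OVER_K_CM_K / wavelength_cm  # h*nu/k expressed in deg K
    x = h_nu_over_k / t_planck_k
    return t_rj_k * math.expm1(x) / h_nu_over_k


# (wavelength in cm, upper limit on the black-body temperature in deg K)
limits = [(0.263, 2.82), (0.132, 4.74), (0.0559, 5.43), (0.0359, 8.11)]
factors = [rayleigh_jeans_over_planck(lam, t_lim) for lam, t_lim in limits]
for (lam, t_lim), factor in zip(limits, factors):
    print(f"lambda = {lam} cm, T < {t_lim} deg K: Rayleigh-Jeans / Planck = {factor:.1f}")
```

Under these assumptions the ratios agree with the quoted factors 2.9, 2.2, 12, and 9.3 to within about one percent.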

However, this simple picture is contradicted by some of the data taken in the
far-infrared by rocket- and balloon-borne equipment. These measurements are
essentially bolometric; what is observed is the total power per unit area and solid
angle received by detectors having various complicated spectral-response functions.
Initially the rocket observations at Cornell 63 and the balloon observations
at M.I.T. 64 both indicated a flux many times larger than expected at these wavelengths
for a 2.7°K black-body background. Indeed, taken together with the
interstellar absorption measurements, these data were not consistent with any
smooth spectral distribution, let alone a Planck or Rayleigh-Jeans distribution.
The Cornell measurements have since been recalibrated 64a and repeated, 64b and
now indicate a much smaller flux, but the flux is still two orders of magnitude
greater than expected for a 2.7°K Planck distribution. However, other rocket
observations 64c and new balloon observations by the M.I.T. group 64d give results
consistent with a 2.7°K background. These discrepancies might perhaps arise from
a number of strong lines superimposed on a 2.7°K background, or might be due to
unexpected sources of atmospheric radiation at high altitudes. These uncertainties
will probably be with us until far-infrared measurements can be made with
cryogenic equipment carried by artificial satellites.

In checking the agreement of the observed flux of background radiation at
various wavelengths with the Planck formula, it is useful to keep in mind the
departures from this formula that may be expected on theoretical grounds, even if
the observed microwave radiation represents a cosmic background left over from
the early universe. With a black-body temperature $T_{\gamma 0} = 2.7$°K, the specific
photon entropy (15.5.15) is $\sigma = 1.35 \times 10^8$ for a present mass density $n_0m_N =
1.8 \times 10^{-29}$ g/cm$^3$, or $\sigma = 5.4 \times 10^9$ for $n_0m_N = 4.5 \times 10^{-31}$ g/cm$^3$. As we
have seen, these high $\sigma$ values lead us to expect that the matter temperature $T$
would have followed the radiation temperature $T_\gamma \propto R^{-1}$ as long as there was
any appreciable thermal contact between matter and radiation. This expectation
is borne out by detailed calculations by Peebles of recombination in a universe
filled with ionized hydrogen. 65 For a present density $n_0m_N = 1.8 \times 10^{-29}$ g/cm$^3$,
the fractional ionization dropped sharply from 99.8% at $T_\gamma = 5000$°K to 0.98%
at $T_\gamma = 3000$°K, and then to 0.0053% at $T_\gamma = 1500$°K. However, even though
the mean free path of photons at these low ionization levels was very long, the
matter temperature when $T_\gamma = 2000$°K was $T = 1920$°K, and at $T_\gamma = 1500$°K,
$T = 1280$°K. For smaller values of the present mass density, $T$ followed $T_\gamma$ even
more closely. In consequence, the departures from a Planck distribution should be
quite small. According to Peebles, the largest effects are the excess of photons left
over from the $2p \to 1s$ Lyman $\alpha$ transition and the $2s \to 1s$ two-photon transition,
by which the recombined hydrogen atoms reached their ground states. These
photons now show up red-shifted by a factor $\gtrsim 1000$ from $\lambda(\mathrm{Lyman}\ \alpha) =
1215$ Å and $\lambda(2\gamma) \approx 2500$ Å, and thus produce departures from a Planck distribution
at wavelengths shorter than 0.015 mm. Unfortunately, at these short wavelengths
the cosmic radiation background is much less intense than the radiation
from interstellar dust 66 and gas 67 in our galaxy, so it is unlikely that we shall be
able to observe these departures from a black-body spectral distribution.
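
The two values of the specific photon entropy quoted at the start of this discussion follow directly from Eq. (15.5.15). The Python sketch below reproduces them; the nucleon mass used to convert mass density to number density is an assumed value (the proton mass), and the function name is hypothetical.

```python
NUCLEON_MASS_G = 1.6726e-24  # assumed nucleon mass m_N in grams (proton mass)


def photon_entropy_today(t_gamma0_k, mass_density_g_cm3):
    """Specific photon entropy sigma = 74.0 * T^3 / n_0 from Eq. (15.5.15)."""
    n0 = mass_density_g_cm3 / NUCLEON_MASS_G  # nucleons per cm^3
    return 74.0 * t_gamma0_k**3 / n0


sigma_dense = photon_entropy_today(2.7, 1.8e-29)
sigma_dilute = photon_entropy_today(2.7, 4.5e-31)
print(f"sigma = {sigma_dense:.2e} for 1.8e-29 g/cm^3")
print(f"sigma = {sigma_dilute:.2e} for 4.5e-31 g/cm^3")
```

Both values are enormously larger than unity, which is the sense in which this is a hot universe.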

There is one other important possible source of departures from the Planck
spectrum. The calculations of Peebles 65 show that by the time the radiation
temperature dropped to 200°K, the residual hydrogen ionization was extremely
small, of order $10^{-4}$ to $10^{-5}$. However, the experiment on Lyman $\alpha$ absorption
discussed in the last section shows that there cannot have been any appreciable
amount of neutral hydrogen since a time when $T_\gamma \approx 8$°K, corresponding to $z \approx 2$.
If there really is a good deal of intergalactic hydrogen gas, as indicated by measurements
of $q_0$ (see Section 15.2), then somehow or other this hydrogen must have been
reionized at a time when $T_\gamma$ was between 4000°K and 8°K. If the reionization was
very early, then thermal contact would have been reestablished between matter
and radiation, and the Planck spectrum would have been distorted by an increase
in the individual photon energies. According to Sunyaev, 68 the agreement of
the observed background radiation spectrum with the Planck formula already
shows that the reionization could not have occurred until $T_\gamma$ dropped to about
800°K.

There is also a great deal to be learned from the distribution of the microwave 
background in angle. If this radiation really is left over from an earlier period when 
matter and radiation were in thermal equilibrium, then we should expect the 
radiation flux to be isotropic. However, there might be anisotropies of small 
angular scale, owing to inhomogeneities in the primordial plasma, possibly 
associated with the presence of nascent galaxies. 66 (See Section 15.8.) There might 
also be anisotropies of larger angular scale, owing to a departure of the universe as a 
whole or our local gravitational field 69 from perfect isotropy, and there certainly 
is a small anisotropy with a 360° angular scale owing to the motion of the solar 
system relative to the radiation background. If the radiation background does not 
come from an earlier period of thermal equilibrium, then its angular distribution 
may reveal its source ; for instance, if the radiation comes from a large number of 
discrete sources, then we should find large anisotropies of very small angular scale, 
whereas if it comes from our own galaxy, then we should expect a large-scale 
anisotropy correlated with galactic latitude. 

In looking for anisotropies of small angular scale, a large antenna pointed at a
fixed angle relative to the earth is swept across the sky by the rotation of the earth.
If no special care is taken to maintain a stable calibration, then the measured
antenna temperature will show a gradual drift in time, which does not concern us
here. There will also be a small fluctuation around this general drift, characterized
by an r.m.s. fluctuation value $(\Delta T_A)_{\rm obs}$. If there really is an intrinsic fluctuation
$\Delta T_A$ with an angular scale $\theta$ comparable with the beam width $B$, then $(\Delta T_A)^2_{\rm obs}$
will be given by $(\Delta T_A)^2$ plus a term arising from noise in the receiver, so that

$$\Delta T_A < (\Delta T_A)_{\rm obs} \quad \text{for } \theta \approx B \qquad (15.5.21)$$

On the other hand, if the intrinsic fluctuation scale $\theta$ is much less than the beam
width $B$, then the beam can be regarded as split into $N$ patches of angular diameter
$\theta$, where

$$N \approx \left(\frac{B}{\theta}\right)^2$$

The fluctuation $\Delta P$ in the power $P$ from each patch is given by (15.5.7) as

$$\frac{\Delta P}{P} = \frac{\Delta T_A}{T_A}$$

The total power received is $NP$, but the fluctuations have random sign, and so the
r.m.s. fluctuation in the total power is $N^{1/2}\Delta P$. Taking account of receiver noise,
the observed relative fluctuation in received power will be greater than $N^{1/2}\Delta P/NP$,
so

$$\frac{(\Delta T_A)_{\rm obs}}{T_A} \gtrsim N^{-1/2}\,\frac{\Delta P}{P} \simeq N^{-1/2}\,\frac{\Delta T_A}{T_A}$$

and therefore

$$\Delta T_A \lesssim N^{1/2}(\Delta T_A)_{\rm obs} \approx \left(\frac{B}{\theta}\right)(\Delta T_A)_{\rm obs} \quad \text{for } \theta \ll B \qquad (15.5.22)$$

A more detailed analysis 70 shows that for fluctuations of arbitrary angular scale $\theta$

$$\Delta T_A < \left[1 + \frac{B^2}{\theta^2}\right]^{1/2}(\Delta T_A)_{\rm obs}$$

in agreement with (15.5.21) and (15.5.22). For a very strong intrinsic fluctuation
with $\Delta T_A \approx T_A$, Eq. (15.5.22) sets an upper limit on the angular scale

$$\theta_{\max} \simeq B\,\frac{(\Delta T_A)_{\rm obs}}{T_A} \qquad (15.5.23)$$

Measurements of $(\Delta T_A)_{\rm obs}$ at various wavelengths and beam widths are listed in
Table 15.2. The anisotropy is evidently less than a few percent on any angular
scale larger than a few seconds of arc.
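
The limit (15.5.23) is simple enough to evaluate for individual measurements. The Python sketch below applies it to three entries of Table 15.2 (beam width, observed fluctuation, and antenna temperature as listed there); the function name and the choice of rows are illustrative.

```python
ARCSEC_PER_ARCMIN = 60.0
ARCSEC_PER_DEGREE = 3600.0


def theta_max_arcsec(beam_width_arcsec, delta_t_obs_k, antenna_temperature_k):
    """Largest angular scale allowing gross anisotropy, Eq. (15.5.23)."""
    return beam_width_arcsec * delta_t_obs_k / antenna_temperature_k


# (label, beam width in arcsec, observed r.m.s. fluctuation in deg K, T_A in deg K)
rows = [
    ("7.35 cm, 40' beam", 40 * ARCSEC_PER_ARCMIN, 0.006, 2.56),
    ("2.80 cm, 1 deg beam", 1 * ARCSEC_PER_DEGREE, 0.051, 2.45),
    ("2.80 cm, 10' beam", 10 * ARCSEC_PER_ARCMIN, 0.0061, 2.45),
]
results = {label: theta_max_arcsec(b, dt, ta) for label, b, dt, ta in rows}
for label, value in results.items():
    print(f"{label}: theta_max ~ {value:.1f} arcsec")
```

These come out near 5.6, 75, and 1.5 seconds of arc, matching the tabulated values of 5", 75", and 1.5" to the precision given.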

In searching for anisotropies of large angular scale , it is not necessary to use a 
large antenna, but care must be taken to maintain a stable receiver calibration as 
the antenna beam is swept across the sky by the rotation of the earth. In the work 
of Partridge and Wilkinson, 71 this is managed by aiming the horn so that it points
near the celestial equator, and then for 15 min in each half-hour, inserting a vertical 
reflector that aims the beam toward the north celestial pole. With and without the 
reflector, the angle between the antenna beam and the vertical is the same (48°), 
so the effects of heating of the atmosphere, as well as of the apparatus, should be 
the same. However, when the reflector is absent, the beam is scanned across the 
celestial equator as the earth turns, whereas with the reflector inserted, the beam 
points toward a more or less fixed point on the celestial sphere. Hence any change 
with time in the difference between the radiation flux received with and without the 
reflector should be a measure of an intrinsic variation of the flux with right 
ascension (i.e., azimuth) near the celestial equator. This variation must have a 


Table 15.2 Summary of Measurements of Fluctuations in the Microwave
Background of Small Angular Scale

$\lambda$ (cm)   $T_A$ (°K)   $B$           $(\Delta T_A)_{\rm obs}$ (°K)   $\theta_{\max}$   Reference
7.35     2.56      40'         0.006                5"        a
3.95     2.50      1.4' x 20'  0.0007               0.1"      a'
2.80     2.45      1°          0.051                75"       b
                   6°          0.036                -
2.80     2.45      10'         0.0061               1.5"      c
                   2°          0.0017               -
0.35     1.14      ~75"        0.024                1.6"      d
0.35     1.14      80"-100"    0.008                0.7"      d'
0.34     1.11      12.5'       0.2                  -         e

Here $\lambda$ is the wavelength, $T_A$ is the antenna temperature for 2.70°K black-body
radiation, $B$ is the beam width, $(\Delta T_A)_{\rm obs}$ is the observed r.m.s. fluctuation in
antenna temperature, and $\theta_{\max}$ is the largest angular scale at which the observations
would allow gross anisotropies [see Eq. (15.5.23)]. The “beam widths” of 1°, 2°,
and 6° were synthesized by integration of data obtained with a 10' beam width.
The measurement of $\Delta T_A$ at 0.34 cm really is a measure of the change in slope of
$T_A(\theta)$ over an angular interval 12.5".

a A. A. Penzias and R. W. Wilson, Ap. J., 142, 419 (1965). 
a' Yu. N. Pariskii and T. B. Pyatunina, Astron. Zh., 47, 1337 (1970) [transl. Sov. Astron.-AJ,
14, 1067 (1971)].

b E. K. Conklin and R. N. Bracewell, Phys. Rev. Letters, 18, 614 (1967). 
c E. K. Conklin and R. N. Bracewell, Nature, 216, 777 (1967). 
d A. A. Penzias, J. Schraml, and R. W. Wilson, Ap. J., 157, L49 (1969). 
d' P. Boynton and R. B. Partridge, private communication. 
e E. E. Epstein, Ap. J., 148, L157 (1967). 


24-hr sidereal period, so it can be Fourier-analyzed into components with periods
$24/n$ hr, where $n$ is any integer.

Measurements of the anisotropy are summarized in Table 15.3. There is 
evidently no statistically significant anisotropy observed, and the maximum change 
in T y0 around the sky is probably less than 1%. 

The upper limits in the 24-hr component of the anisotropy $\Delta T_\gamma/T_{\gamma 0}$ are
particularly interesting, because they set stringent upper limits on the velocity of
the solar system relative to the rest of the universe. Suppose that there is a
fundamental reference frame in which the background radiation is perfectly
isotropic, with a Planck spectrum, and assume that the earth moves with a
velocity $\mathbf{v}_\oplus$ with respect to this fundamental frame. In the fundamental frame,

Table 15.3 Summary of Measurements of Anisotropies of Large Angular Scale
in the Microwave Background

(The data used in reference e include those used in reference d. All values of
$\Delta T_\gamma/T_{\gamma 0}$ are based on an assumed value $T_{\gamma 0} = 2.7$°K.)

$\lambda$ (cm)   Type     $\Delta T_\gamma/T_{\gamma 0}$ (%)   Reference
7.35     r.m.s.   ≤ 10             a
7.35     r.m.s.   ≤ 3.7            b
3.75     24 hr    0.06 ± 0.03      c
3.2      12 hr    0.18 ± 0.08      d
         24 hr    0.03 ± 0.08
3.2      12 hr    0.06 ± 0.06      e
         24 hr    0.04 ± 0.06
0.8      12 hr    0.20 ± 0.24      f
         24 hr    0.28 ± 0.43

a A. A. Penzias and R. W. Wilson, Ap. J., 142, 419 (1965). 
b R. W. Wilson and A. A. Penzias, Science, 156, 1100 (1967). 
c E. K. Conklin, Nature, 222, 971 (1969). 

d R. B. Partridge and D. T. Wilkinson, Phys. Rev. Letters, 18, 557 (1967). 

e D. T. Wilkinson and R. B. Partridge, quoted by R. B. Partridge, American Scientist, 57, 37 
(1969). 

f S. P. Boughn, D. M. Fram, and R. B. Partridge, Ap. J., 165, 439 (1971). 


the photons within a solid angle $\sin\theta\,d\theta\,d\varphi$ and a frequency interval $d\nu$ contribute
to the energy-momentum tensor an amount

$$dT^{\mu\nu} = \frac{p^\mu p^\nu}{h^2\nu^2}\,\frac{\sin\theta\,d\theta\,d\varphi}{4\pi}\,\rho_{\gamma 0}(\nu)\,d\nu
= 2p^\mu p^\nu h^{-1}\left[e^{h\nu/kT_{\gamma 0}} - 1\right]^{-1}\sin\theta\,d\theta\,d\varphi\,\nu\,d\nu$$

where $p^\mu$ is the photon momentum four-vector:

$$p^\mu = h\nu\,(\sin\theta\cos\varphi,\ \sin\theta\sin\varphi,\ \cos\theta,\ 1)$$

(It follows from (2.8.4) that $dT^{\mu\nu}$ is proportional to $p^\mu p^\nu$; the coefficient of $p^\mu p^\nu$ is
determined so that the integral of $dT^{00}$ over $\theta$ and $\varphi$ should be $\rho_{\gamma 0}\,d\nu$.) In the earth
frame these photons have an energy-momentum tensor given by the tensor
transformation rule

$$dT'^{\mu\nu} = \Lambda^\mu{}_\rho\,\Lambda^\nu{}_\sigma\,dT^{\rho\sigma}$$

where $\Lambda$ is the Lorentz transformation defined by (2.1.17)-(2.1.21), taking $\mathbf{v} =
-\mathbf{v}_\oplus$. In order to express $dT'^{\mu\nu}$ in terms of earth-frame quantities, we note that

$$p'^\mu = \Lambda^\mu{}_\nu\,p^\nu$$

or, taking the z-axis in the direction of the earth's velocity,

$$\nu' = \frac{\nu\,[1 - v_\oplus\cos\theta]}{[1 - v_\oplus^2]^{1/2}}$$

$$\cos\theta' = \frac{[-v_\oplus + \cos\theta]}{[1 - v_\oplus\cos\theta]}$$

$$\varphi' = \varphi$$

where $\theta$ is now the angle between the velocities of the earth and photon. The solid
angle then has the transformation rule

$$\sin\theta'\,d\theta'\,d\varphi' = \frac{[1 - v_\oplus^2]}{[1 - v_\oplus\cos\theta]^2}\,\sin\theta\,d\theta\,d\varphi$$

and so the differential energy-momentum tensor in the earth frame is

$$dT'^{\mu\nu} = 2p'^\mu p'^\nu h^{-1}\left[e^{h\nu/kT_{\gamma 0}} - 1\right]^{-1}\sin\theta\,d\theta\,d\varphi\,\nu\,d\nu
= 2p'^\mu p'^\nu h^{-1}\left[e^{h\nu'/kT'_{\gamma 0}} - 1\right]^{-1}\sin\theta'\,d\theta'\,d\varphi'\,\nu'\,d\nu'$$

where

$$T'_{\gamma 0} = \left(\frac{\nu'}{\nu}\right)T_{\gamma 0} = [1 - v_\oplus^2]^{-1/2}\,[1 - v_\oplus\cos\theta]\,T_{\gamma 0} \qquad (15.5.24)$$

We see that $dT'^{\mu\nu}$ has the same form as $dT^{\mu\nu}$, so that the background radiation in the
earth frame has a Planck spectrum, but with an angle-dependent temperature $T'_{\gamma 0}$.
For $v_\oplus \ll 1$, the departure of the measured temperature from the “true” black-body
temperature $T_{\gamma 0}$ is

$$\Delta T_{\gamma 0} \simeq -v_\oplus\cos\theta\;T_{\gamma 0} \qquad (15.5.25)$$

In the experiments of Partridge and Wilkinson and of Conklin, the antenna
beam scans a circle on the celestial sphere of fixed declination $\delta$ once a day, so
$\Delta T_{\gamma 0}$ should have a 24-hr period, with maximum value given by

$$\frac{(\Delta T_{\gamma 0})_{\max}}{T_{\gamma 0}} = \frac{v_\oplus(\delta)}{c} \qquad (15.5.26)$$


where $v_\oplus(\delta)$ is the component of the earth's velocity (in c.g.s. units) along the cone
of declination $\delta$. This maximum is attained when the antenna points in the
azimuthal direction toward which the earth is moving. The combined data of
Partridge and Wilkinson 71 give as a most likely velocity $v_\oplus(0^\circ) \simeq 120$ km/sec
with a direction toward 0 hr right ascension and a vector error of magnitude
180 km/sec. Conklin 72 gives as a most likely velocity $v_\oplus(32^\circ\mathrm{N}) \simeq 160$ km/sec
with a direction toward 13 hr right ascension (just the opposite to Partridge and
Wilkinson!) and a vector error of magnitude 85 km/sec. It is reasonable to conclude
from these two results that

$$|\mathbf{v}_\oplus| < 300\ \mathrm{km/sec} \qquad (15.5.27)$$

This upper limit is already of the same order of magnitude as the velocity of the
solar system in the local group of galaxies (owing mostly to the rotation of our own
galaxy) which is estimated 73 as 315 km/sec toward 22 hr right ascension. Clearly
neither the earth, nor the whole local group of galaxies, is moving at great velocity
relative to the radiation background. It will be of very great interest to learn how
fast we are moving, and in what direction.
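
To get a feeling for the size of the effect, Eq. (15.5.24) can be evaluated for a speed equal to the upper limit (15.5.27). The Python sketch below is an illustration only; the value used for the speed of light and the function name are assumptions, and $T_{\gamma 0} = 2.7$°K is taken as in Table 15.3.

```python
import math

C_KM_PER_SEC = 2.998e5  # speed of light in km/sec (assumed value)


def earth_frame_temperature(t0_k, speed_km_s, theta_rad):
    """Angle-dependent black-body temperature from Eq. (15.5.24).

    theta_rad is the angle between the earth's velocity and the photon's
    velocity, so theta = pi describes radiation arriving from the direction
    toward which the earth is moving.
    """
    beta = speed_km_s / C_KM_PER_SEC
    return t0_k * (1.0 - beta * math.cos(theta_rad)) / math.sqrt(1.0 - beta**2)


T0 = 2.7
speed = 300.0
beta = speed / C_KM_PER_SEC
delta_max = earth_frame_temperature(T0, speed, math.pi) - T0
print(f"maximum Delta T = {delta_max * 1e3:.2f} millidegrees "
      f"({100 * delta_max / T0:.2f}% of T0)")
```

A velocity of 300 km/sec therefore corresponds to a 24-hr anisotropy of about 0.1%, which is the level probed by the measurements in Table 15.3.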

Apart from the effects of the earth's motion or local gravitational fields, the
microwave background might also exhibit anisotropies owing to a cosmic inhomogeneity
at the time $t_R$ that the radiation was last emitted or scattered. If there has
been no scattering of the background radiation since the recombination of hydrogen
at about 4000°K, then the time $t_R$ corresponds to a red shift $z_R$ given by

$$1 + z_R = \frac{R(t_0)}{R(t_R)} = \frac{T_\gamma(t_R)}{T_{\gamma 0}} \simeq \frac{4000\,^\circ\mathrm{K}}{2.7\,^\circ\mathrm{K}} \simeq 1500$$

On the other hand, if there is an intergalactic free electron gas with number
density $1.2 \times 10^{-5}$/cm$^3$, then, as remarked in the last section, the time of last
scattering would correspond to a red shift $z_R \approx 6$. It would be very interesting to
use the observed isotropy or anisotropy of the present microwave background to
determine the scales of distances at which the universe is homogeneous or inhomogeneous
at the time $t_R$.

To this end, consider two photons that leave comoving sources $A$ and $B$ at time
$t_R$, and arrive at the earth at time $t_0$, traveling along paths separated at the earth
by an angle $\theta$. With the earth taken as the origin, Eq. (14.3.1) gives the radial
coordinates of the sources $A$ and $B$ as

$$r_A = r_B = r_1 \tag{15.5.28}$$

where

$$\int_0^{r_1} \frac{dr}{\sqrt{1 - kr^2}} = \int_{t_R}^{t_0} \frac{dt}{R(t)} \tag{15.5.29}$$


Since photons travel to the earth on trajectories with constant direction $\mathbf{x}/r$, the
sources will be separated in the Robertson-Walker coordinate system by exactly
the observed angle $\theta$ that separates the light rays arriving at the earth. That is,

$$\frac{\mathbf{x}_A \cdot \mathbf{x}_B}{r_1^2} = \cos\theta \tag{15.5.30}$$

with the scalar product defined in terms of the Robertson-Walker coordinates $x^i$
as if these coordinates were Cartesian:

$$\mathbf{x}_A \cdot \mathbf{x}_B \equiv x_A^1 x_B^1 + x_A^2 x_B^2 + x_A^3 x_B^3 = r_A r_B [\sin\theta_A \sin\theta_B \cos(\varphi_A - \varphi_B) + \cos\theta_A \cos\theta_B] \tag{15.5.31}$$

Our problem is to determine the proper distance along a geodesic from $A$ to $B$ at
time $t_R$ as a function of $\theta$, for various assumed values of $z_R$ ranging from 6 to 1500.

According to Eq. (14.4.3), the geodesic from A to B can be chosen (setting x l 
equal to a vector ae normal to n) to have the form 

$$\mathbf{x}(p) = \mathbf{n}p + a\mathbf{e}(1 - kp^2)^{1/2} \tag{15.5.32}$$

where $a$ is a constant, $p$ is a variable parameter, and $\mathbf{n}$ and $\mathbf{e}$ are orthogonal unit
vectors,

$$\mathbf{n} \cdot \mathbf{e} = 0 \qquad \mathbf{n}^2 = \mathbf{e}^2 = 1 \tag{15.5.33}$$

the scalar products being defined as in Eq. (15.5.31). The initial and final values of
$p$ are $-p_1$ and $+p_1$, with $p_1$ determined by the condition (15.5.28), that is,

$$r_1^2 = |\mathbf{x}(\pm p_1)|^2 = p_1^2 + a^2(1 - kp_1^2)$$

In addition, the condition (15.5.30) gives

$$\cos\theta = \frac{\mathbf{x}(+p_1) \cdot \mathbf{x}(-p_1)}{r_1^2} = \frac{-p_1^2 + a^2(1 - kp_1^2)}{r_1^2}$$

Both $p_1$ and $a$ can thus be expressed in terms of $r_1$ and $\theta$:

$$p_1 = r_1 \sin\frac{\theta}{2} \qquad a = r_1 \cos\frac{\theta}{2}\left[1 - kr_1^2 \sin^2\frac{\theta}{2}\right]^{-1/2}$$

The proper distance from $A$ to $B$ can now be calculated by integrating the
Robertson-Walker line element from $-p_1$ to $+p_1$:

$$d(\theta) = R(t_R) \int_{-p_1}^{+p_1} \left[\left(\frac{d\mathbf{x}(p)}{dp}\right)^2 + \frac{k\,(\mathbf{x}(p) \cdot d\mathbf{x}(p)/dp)^2}{1 - k\mathbf{x}^2(p)}\right]^{1/2} dp$$

and thus

$$d(\theta) = \frac{2R_0}{1 + z_R} \int_0^{r_1 \sin(\theta/2)} \frac{dp}{\sqrt{1 - kp^2}} \tag{15.5.34}$$

If the time $t_R$ of last scattering or emission occurs after the start of the
matter-dominated era, then (15.2.5) and (15.3.23) can be used to express $R_0$ and $r_1$ in
terms of $H_0$, $q_0$, and $z_R$, and we find


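$$d(\theta) = \frac{2}{H_0(1 + z_R)\sqrt{2q_0 - 1}} \sin^{-1}\left(\frac{\sqrt{2q_0 - 1}\,[z_R q_0 + (q_0 - 1)(-1 + \sqrt{2q_0 z_R + 1})]}{q_0^2(1 + z_R)} \sin\frac{\theta}{2}\right) \quad \text{for } q_0 > \tfrac{1}{2},\ k = +1 \tag{15.5.35}$$

$$d(\theta) = \frac{4}{H_0(1 + z_R)}\left\{1 - (1 + z_R)^{-1/2}\right\} \sin\frac{\theta}{2} \quad \text{for } q_0 = \tfrac{1}{2},\ k = 0 \tag{15.5.36}$$

$$d(\theta) = \frac{2}{H_0(1 + z_R)\sqrt{1 - 2q_0}} \sinh^{-1}\left(\frac{\sqrt{1 - 2q_0}\,[z_R q_0 + (q_0 - 1)(-1 + \sqrt{2q_0 z_R + 1})]}{q_0^2(1 + z_R)} \sin\frac{\theta}{2}\right) \quad \text{for } q_0 < \tfrac{1}{2},\ k = -1 \tag{15.5.37}$$

In particular, for $\theta \to 0$, Eqs. (15.5.35)-(15.5.37) give

$$d(\theta) \to \frac{[z_R q_0 + (q_0 - 1)(-1 + \sqrt{2q_0 z_R + 1})]\,\theta}{q_0^2(1 + z_R)^2 H_0} \quad \text{for } \theta \to 0$$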

If the homogeneity of the universe is achieved by the physical transport of
energy and momentum from one place to another at velocities less than that of
light, then we should expect$^{74}$ the universe at time $t_R$ to be inhomogeneous over
distances larger than twice the “particle horizon” (15.3.32), because no homogenizing
signal could travel from any point to a pair of comoving particles separated
by a proper distance greater than $2d_H(t_R)$ by time $t_R$. If this is correct, then the
microwave background ought to exhibit large anisotropies for angular scales greater
than an angle $\theta_H$, which can be calculated by equating $2d_H(t_R)$, given by Eq.
(15.3.33), to $d(\theta_H)$, given by Eqs. (15.5.35)-(15.5.37):

$$\sin\frac{\theta_H}{2} = \frac{q_0\sqrt{2q_0 z_R + 1}}{z_R q_0 + (q_0 - 1)(-1 + \sqrt{2q_0 z_R + 1})} \tag{15.5.38}$$

If $z_R \simeq 1500$, then we can use the approximation

$$\theta_H \simeq 2\left(\frac{2q_0}{z_R}\right)^{1/2} \simeq 4.2^\circ\sqrt{q_0} \tag{15.5.39}$$

(This result would not be very much changed if the matter-dominated era began
somewhat after the recombination of hydrogen.) If $z_R \simeq 6$ and $q_0 = \tfrac{1}{2}$ or $q_0 = 1$,
then $\theta_H \simeq 75^\circ$. However, there is no sign of any appreciable anisotropy in the
microwave background at such angular scales; on the contrary, the microwave
radiation appears to be highly isotropic on all angular scales greater than 1°. In
the light of the above analysis, it is difficult to understand how such a high degree
of isotropy could be produced by any physical process occurring at any time since
the initial singularity.
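
As a numerical check on the horizon angle, the following short Python sketch evaluates the relation $\sin(\theta_H/2) = q_0\sqrt{2q_0 z_R + 1}\,/\,[z_R q_0 + (q_0 - 1)(-1 + \sqrt{2q_0 z_R + 1})]$ for the cases quoted above. The function name and the clamping of the sine at unity (for redshifts so small that the horizon would cover the whole sky) are illustrative choices, not part of the text.

```python
import math

def horizon_angle_deg(q0, z_r):
    """Angular scale theta_H (degrees) subtended by twice the particle horizon.

    Uses the matter-dominated relation for sin(theta_H / 2) in terms of the
    deceleration parameter q0 and the red shift z_r of last scattering.
    """
    root = math.sqrt(2.0 * q0 * z_r + 1.0)
    denom = z_r * q0 + (q0 - 1.0) * (root - 1.0)
    s = q0 * root / denom
    s = min(s, 1.0)  # assumption: clamp when the horizon spans the whole sky
    return math.degrees(2.0 * math.asin(s))

def horizon_angle_large_z_deg(q0, z_r):
    """Large-z_r approximation theta_H ~ 2 * sqrt(2 * q0 / z_r), in degrees."""
    return math.degrees(2.0 * math.sqrt(2.0 * q0 / z_r))

if __name__ == "__main__":
    for q0, z_r in [(0.5, 6.0), (1.0, 6.0), (1.0, 1500.0), (0.5, 1500.0)]:
        print(q0, z_r, round(horizon_angle_deg(q0, z_r), 2),
              round(horizon_angle_large_z_deg(q0, z_r), 2))
```

For $z_R = 1500$ the exact and approximate expressions agree closely, while for $z_R = 6$ only the exact expression should be used.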

The observed distributions of the radiation background in frequency and angle
certainly suggest that this is isotropic black-body radiation left over from an earlier
period when matter and radiation were in thermal equilibrium. However, other
possibilities are not yet excluded by the data. The energy density of starlight within
our galaxy is of the order of $5 \times 10^{-13}$ erg/cm$^3$, just about the same as the energy
density of 2.7°K black-body radiation. For this reason, Hoyle, Narlikar, and
Wickramasinghe$^{75}$ have suggested that a large fraction of the optical-frequency
starlight in our own and other galaxies may be absorbed by interstellar grains,
which are heated to a few degrees, and reemit the energy at microwave frequencies,
either as a continuum or in discrete lines. It would not be impossible for this
reemitted radiation to be isotropic and to imitate a Planck spectrum, but this seems
artificial. Another possibility that has been widely considered is that the microwave
background may arise from a large number of discrete sources.$^{76}$ Here again, a
Planck spectrum would not be impossible over the accessible range of wavelengths,
but there is no special reason to expect it. Also, in this case the observed
isotropy does put severe limits on any discrete source theory. For example, if the
microwave background comes from discrete sources at an average distance of
order $H_0^{-1}$, then we should expect gross anisotropies in the microwave background
at an angular scale $\theta$ such that the volume $H_0^{-3}\theta^2$ contains about one source,
that is, for

$$H_0^{-3}\theta^2 d^{-3} \approx 1 \tag{15.5.40}$$

where $d$ is the mean separation of the sources. The limit $\theta < 1$ sec given in
Table 15.2 thus sets an upper limit $d < 1$ Mpc, about the same as the mean
separation of galaxies. Detailed analyses$^{77}$ of the data in specific models show a
density even greater than that of galaxies, which would seem to rule out such
theories.
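
The estimate of the source separation can be reproduced with a few lines of Python. This is only a sketch: it solves $H_0^{-3}\theta^2 d^{-3} \approx 1$ for $d$, and the Hubble distance of 4000 Mpc (corresponding to $H_0 \approx 75$ km/sec/Mpc) is an assumed illustrative value, not one taken from this section.

```python
import math

ARCSEC_IN_RADIANS = math.pi / (180.0 * 3600.0)

def source_separation_mpc(theta_arcsec, hubble_distance_mpc=4000.0):
    """Mean source separation d (Mpc) for which one source fills the
    volume H0^-3 * theta^2, i.e. d = H0^-1 * theta^(2/3).

    hubble_distance_mpc is an assumed value of c / H0.
    """
    theta = theta_arcsec * ARCSEC_IN_RADIANS
    return hubble_distance_mpc * theta ** (2.0 / 3.0)

if __name__ == "__main__":
    # An isotropy limit at one second of arc gives d of order 1 Mpc.
    print(source_separation_mpc(1.0))
```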

The most interesting effects of the cosmic radiation background occur at early 
times, when the temperature was much greater than at present. These effects will 
be the subject of the next six sections. However, even at present, the radiation 
background can have some interesting effects: 

(A) A relativistic electron of energy $\gamma_e m_e$ will undergo inverse Compton
scattering on the microwave photons, producing recoil photons of average energy$^{78}$

$$E = 3.6\,\gamma_e^2\,kT_{\gamma 0} = 8.4 \times 10^{-4}\,\gamma_e^2 \text{ eV} \tag{15.5.41}$$

Hoyle$^{79}$ proposed that inverse Compton scattering of cosmic ray electrons
within our galaxy is responsible for the diffuse background$^{13}$ of cosmic X-rays,
but it was pointed out by Gould$^{80}$ that the intensity from this mechanism is several
hundred times smaller than the observed X-ray background. Soon after, Felten$^{81}$
showed that the inverse Compton scattering of cosmic ray electrons in intergalactic
space could produce X-rays of the observed intensity. This model has received
support from the remark of Brecher and Morrison,$^{82}$ that an observed kink in the
cosmic ray electron spectrum at $\gamma_e \approx 7 \times 10^3$ would, according to (15.5.41),
produce a kink in the spectrum of the diffuse X-ray background at about 40 keV,
just where a kink is observed.$^{13}$ However, the more recent calculations discussed
in the last section indicate that this kink is due to thermal bremsstrahlung in hot
intergalactic hydrogen, with inverse Compton scattering important only below
1 keV. The range of high-energy electrons in a 2.7°K background drops sharply
for $\gamma_e > 10^4$, so if cosmic ray electrons really come to us across intergalactic
space, the observed electron energy spectrum should be cut off sharply at energies
above 10 GeV.

(B) It has been observed$^{83}$ that the very strong radio galaxy Centaurus A
emits X-rays in a frequency range 1 to 10 keV with a total power $L_x = (11 \pm 4) \times
10^{40}$ erg/sec. By using the theory of synchrotron emission to account for the
observed radio flux, it is estimated$^{84}$ that Centaurus A contains about $1.7 \times 10^{59}$
ergs in cosmic ray electrons, typically with $\gamma_e \simeq 2.5 \times 10^3$. The inverse Compton
scattering of these electrons on a 2.7°K radiation background would produce
X-rays, at an average energy given by (15.5.41) as 5 keV, with a total power
$L_x \simeq 5 \times 10^{40}$ erg/sec, in agreement with the observed value. The most important
aspect of this result is that the predicted X-ray power is most sensitive to the
background radiation flux at short wavelengths, so that if the early rocket$^{63}$ and
balloon$^{64}$ observations really gave the correct temperature at these wavelengths,
the X-ray power from Centaurus A would be more than an order of magnitude
larger than observed. However, this interpretation of the Cen A X-ray source is
still in doubt.

(C) When a particle of mass $m$ and momentum $p$ strikes a photon of energy $w$
at an angle $\theta$, the total energy in the center-of-mass system is

$$E_c^2 = (w + (p^2 + m^2)^{1/2})^2 - (p^2 + 2pw\cos\theta + w^2) = 2w[(p^2 + m^2)^{1/2} - p\cos\theta] + m^2 \tag{15.5.42}$$

In order for a nucleon to have a cross-section on a photon that is of first rather
than second order in $\alpha = 1/137$, it is necessary for $E_c$ to be greater than the
threshold $m_N + m_\pi$ for the process $\gamma + N \to \pi + N$:

$$(p^2 + m_N^2)^{1/2} - p\cos\theta > \frac{m_\pi^2 + 2m_N m_\pi}{2w} \simeq \frac{m_N m_\pi}{w}$$

Thus we expect a sharp cutoff$^{85}$ in the cosmic ray proton energy spectrum at
an energy of order $m_N m_\pi / kT_{\gamma 0}$,
which is just about at the upper limit of current cosmic ray observations. Similarly,
for cosmic ray photons, the pair production process $\gamma + \gamma \to e^+ + e^-$ gives a
sharp drop$^{86}$ in $\gamma$-ray range for $E_c > 2m_e$, that is, at an energy

$$E_{\gamma,\max} \approx \frac{2m_e^2}{kT_{\gamma 0}} \simeq 10^{15} \text{ eV}$$

These upper limits apply only if we assume that the high-energy cosmic ray photons
and protons arise outside our own galaxy.
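
The energy scales quoted in (A) through (C) follow from simple arithmetic, sketched below in Python. The particle rest energies and Boltzmann's constant are standard values supplied here as assumptions, the photon energy $w$ is taken to be of order $kT_{\gamma 0}$, and the function names are illustrative only.

```python
K_BOLTZMANN_EV = 8.617e-5   # Boltzmann constant in eV per degree K (assumed value)
T_GAMMA0 = 2.7              # present background temperature in degrees K
M_E = 0.511e6               # electron rest energy, eV
M_PI = 139.6e6              # charged pion rest energy, eV
M_N = 938.9e6               # nucleon rest energy, eV

KT0 = K_BOLTZMANN_EV * T_GAMMA0   # typical photon energy scale, eV

def inverse_compton_energy_ev(gamma_e):
    """Average recoil photon energy 3.6 * gamma_e^2 * k * T, Eq. (15.5.41)."""
    return 3.6 * gamma_e ** 2 * KT0

def proton_cutoff_ev():
    """Order-of-magnitude photopion threshold m_N * m_pi / (k T)."""
    return M_N * M_PI / KT0

def photon_cutoff_ev():
    """Pair-production cutoff 2 * m_e^2 / (k T)."""
    return 2.0 * M_E ** 2 / KT0

if __name__ == "__main__":
    print("kink at gamma_e = 7e3  :", inverse_compton_energy_ev(7.0e3), "eV")
    print("Cen A, gamma_e = 2.5e3 :", inverse_compton_energy_ev(2.5e3), "eV")
    print("proton cutoff          :", proton_cutoff_ev(), "eV")
    print("photon cutoff          :", photon_cutoff_ev(), "eV")
```

These are threshold estimates only; the actual cutoffs depend on the full photon spectrum and on the collision angle.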


It is not yet certain that the observed microwave background really is black- 
body radiation left over from an earlier era. However, the case for this view is 
certainly good enough to warrant a thorough examination of its implications for 
the early universe. We now turn to a consideration of these consequences. 


6 Thermal History of the Early Universe 

The energy density of the present 2.7°K microwave background is

$$\rho_{\gamma 0} = aT_{\gamma 0}^4 = 3.97 \times 10^{-13} \text{ erg/cm}^3 = 4.40 \times 10^{-34} \text{ g/cm}^3 \tag{15.6.1}$$

As already remarked in Section 15.2, this is considerably less than the present 
nucleonic rest-mass density, so that we presently are in a matter-dominated era, 
which has lasted throughout most of the history of the universe. This era was 
discussed in detail in Section 15.3. 

We now turn our attention back to an earlier period, when radiation and 
relativistic particles were more important than ordinary matter. In order to avoid 
losing the thread of our story in the details of our calculations, it may help to
outline first what is now commonly pictured to be the early history of the universe, 
and then go into the detailed calculations that support this picture. The outline of 
universal history is currently believed to be something as follows (see Figure 15.5) : 

(A) At very early times, when the temperature $T$ was above $10^{12}$ °K, the
universe contained a great variety of particles in thermal equilibrium, including
photons, leptons, mesons, and nucleons and their antiparticles. The strong interactions
among mesons and nucleons make this era very difficult to study; it will be
discussed briefly in Section 15.11.

(B) At the time when $T \approx 10^{12}$ °K, the universe contained photons, muons,
antimuons, electrons, positrons, neutrinos, and antineutrinos. In addition, there
was a very small nucleonic contamination, with neutrons and protons in equal
numbers. All of these particles were in thermal equilibrium.

(C) As the temperature dropped below $10^{12}$ °K, the $\mu^+$ and $\mu^-$ began to
annihilate. After almost all muons were gone, at $T \simeq 1.3 \times 10^{11}$ °K, the neutrinos
and antineutrinos decoupled from the other particles, leaving $e^\pm$, $\gamma$, and a few
nucleons in thermal equilibrium, with $T \propto R^{-1}$. (The electron-type neutrinos may
have remained in equilibrium with the other particles a little longer than the
muon-type neutrinos, but this makes no difference.)

(D) As the temperature dropped below $10^{11}$ °K ($t \simeq 0.01$ sec), the neutron-proton
mass difference began to shift the small nucleonic contamination toward
more protons and fewer neutrons.

(E) As the temperature dropped below $5 \times 10^9$ °K ($t \simeq 4$ sec), the electron-positron
pairs began to annihilate, leaving as the dominant constituents of the
universe only photons, neutrinos, and antineutrinos in essentially free expansion,
with the photon temperature 40.1% higher than the neutrino temperature. At the
same time, the cooling of the neutrinos and antineutrinos, and the disappearance
of the electrons and positrons, froze the neutron-proton ratio at about 1:5.

(F) At a temperature of about $10^9$ °K ($t \simeq 180$ sec), the neutrons rapidly
began to fuse with protons into heavier nuclei, leaving an ionized gas of hydrogen
and He$^4$, with about 27% helium by weight, and a trace of d, He$^3$, and other
elements.

(G) The free expansion of the photons, neutrinos, and antineutrinos continued,
with $T_\gamma = 1.401\,T_\nu \propto R^{-1}$. The ionized gas temperature remained locked to the
photon temperature until the hydrogen recombined at $T \approx 4000$°K.

(H) At some temperature between $10^3$ °K and $10^5$ °K, the energy density of the
photons, neutrinos, and antineutrinos dropped below the rest-mass density of
hydrogen and helium, and we entered upon the matter-dominated era.

Figure 15.5 Thermal history of the early universe. Here $T$ is the temperature of the
$\gamma$-$e^+$-$e^-$ plasma, and $T_\nu$ is the temperature of the decoupled $\nu_e$, $\bar\nu_e$, $\nu_\mu$, and $\bar\nu_\mu$.

In filling in the details of this history, it will prove very convenient to concentrate
in this section on the thermal evolution of the leading constituents of the
early universe, the photons and leptons, and postpone our discussion of nucleosynthesis
to the next section.

First, let us consider the equation governing the time scale for expansion of the
early universe. This is somewhat simpler than in the matter-dominated era, because
the curvature of space may be neglected. For $k = \pm 1$, the right-hand side of the
Einstein equation (15.1.20) has a present value given by (15.2.5) and (15.2.6):

$$\frac{8\pi G\rho_0 R_0^2}{3} = \frac{2q_0}{|2q_0 - 1|}$$

We saw in Section 15.2 that $q_0$ is probably greater than 0.014, so at present
$8\pi G\rho R^2/3$ is greater than 0.03. During the matter-dominated era, this quantity
varies as $1/R \propto T_\gamma$, so it was greater than 10 when $T_\gamma$ was 1000°K, and was even
larger at earlier times. Hence, during the whole early history of the universe, $k$ was
much less than the right-hand side of Eq. (15.1.20), and this equation therefore
simplifies to

$$\dot R^2 = \frac{8\pi G\rho R^2}{3} \tag{15.6.2}$$

It will make no difference in our discussion of the early universe whether space is
open or closed.

Now we must consider what were the contents of the early universe. At any
given time, we can expect to find some particles in thermal equilibrium with each
other, other particles in free expansion, and perhaps some particles that are just
passing from one condition to the other. In the ideal-gas approximation, the
number density $n_i(q)\,dq$ of particles of type $i$ with momentum between $q$ and
$q + dq$ is given in thermal equilibrium by a Fermi or a Bose distribution$^{87}$:

$$n_i(q)\,dq = 4\pi h^{-3} g_i\,q^2\,dq \left[\exp\left(\frac{E_i(q) - \mu_i}{kT}\right) \pm 1\right]^{-1} \tag{15.6.3}$$

where $E_i(q) \equiv (m_i^2 + q^2)^{1/2}$ is the particle energy, $\mu_i$ is the chemical potential,
the sign $\pm 1$ is $+1$ for fermions and $-1$ for bosons, and $g_i$ is the number of spin
states, with $g = 1$ for neutrinos and antineutrinos and $g = 2$ for photons, electrons,
muons, nucleons, and their antiparticles.

The chemical potentials must be determined from a consideration of the
conservation laws obeyed by the various possible reactions. The basic rule is that
$\mu_i$ is additively conserved in all reactions.$^{88}$ In particular:

(A) Photons can be emitted or absorbed in an arbitrary reaction in any
number, so $\mu_\gamma = 0$. (Equation (15.6.3) then reduces to the Planck distribution
(15.5.9), with $n_\gamma = \rho_\gamma/h\nu$ and $q = E = h\nu$.)

(B) Particle-antiparticle pairs can annihilate into photons, so the chemical
potentials of a particle and its antiparticle are equal and opposite.

(C) Electrons and muons can be converted into their associated neutrinos $\nu_e$
and $\nu_\mu$ by collisions with each other or with nucleons, in such reactions as

$$e^- + \mu^+ \to \nu_e + \bar\nu_\mu \qquad e^- + p \to \nu_e + n \qquad \mu^- + p \to \nu_\mu + n, \quad \text{etc.}$$

The chemical potentials are therefore related by

$$\mu_{e^-} - \mu_{\nu_e} = \mu_{\mu^-} - \mu_{\nu_\mu} = \mu_n - \mu_p \tag{15.6.4}$$

Altogether there are just four independent conserved intrinsic quantum numbers:
charge, baryon number (nucleons and hyperons minus antinucleons and antihyperons),
electron-lepton number ($e^-$ and $\nu_e$ minus $e^+$ and $\bar\nu_e$), and muon-lepton
number$^{89}$ ($\mu^-$ and $\nu_\mu$ minus $\mu^+$ and $\bar\nu_\mu$). Hence there are just four independent
chemical potentials, which can be taken as $\mu_p$, $\mu_{e^-}$, $\mu_{\nu_e}$, $\mu_{\nu_\mu}$. These four independent
chemical potentials are to be determined by the values for the charge density
$N_Q$, the baryon number density $N_B$, the electron-lepton number density $N_E$, and
the muon-lepton number density $N_M$, all of which simply vary as $R^{-3}$. The
problem of determining the chemical potentials thus leads us to the question:
What are the values of the four densities $N_Q$, $N_B$, $N_E$, and $N_M$?

We know that the average charge density $N_Q$ is zero, or at least very small.$^{90}$
We also know that the baryon number density $N_B$ is much less than the number
density $n_\gamma$ of photons, because at present $N_B \equiv n_p + n_n - n_{\bar p} - n_{\bar n}$ is 8 to 10
orders of magnitude less than $n_\gamma$, whereas at earlier times $N_B R^3$ was strictly
constant and $n_\gamma R^3 \propto (T_\gamma R)^3$ was roughly constant. Unfortunately we know very
little about the present number density of neutrinos, so we cannot estimate the
value of $N_E \equiv n_{e^-} + n_{\nu_e} - n_{e^+} - n_{\bar\nu_e}$ or $N_M \equiv n_{\mu^-} + n_{\nu_\mu} - n_{\mu^+} - n_{\bar\nu_\mu}$.

However, since $N_B$ is 8 to 10 orders of magnitude less than $n_\gamma$, it is at least a
reasonable guess that $N_E$ and $N_M$ are also much less than $n_\gamma$. If so, then it is a
good approximation to set all the conserved quantum numbers equal to zero:

$$N_Q = N_B = N_E = N_M = 0 \tag{15.6.5}$$

Of course, $N_B$ is not really zero, and we shall have to put the baryons back into the
calculation in the next section when we consider the synthesis of the elements, but
$N_B$ can be ignored in calculating the gross thermal history of the early universe.
The question of whether $N_E$ and $N_M$ can also be ignored will be taken up at the
end of this section.

The problem of determining the chemical potentials is now very easy. The
chemical potentials of particles and antiparticles are equal and opposite, so the
four densities $N_Q$, $N_B$, $N_E$, and $N_M$ are odd functions of the four independent
chemical potentials $\mu_p$, $\mu_{e^-}$, $\mu_{\nu_e}$, $\mu_{\nu_\mu}$. Hence the values of the $\mu_i$ determined by
(15.6.5) are simply

$$\mu_i = 0 \tag{15.6.6}$$

This approximation allows us to deal with energy conservation in a very
convenient manner. The total energy density and pressure of all the particles in
thermal equilibrium are now evidently just functions of the temperature alone:

$$\rho_{eq}(T) \equiv \sum_{i(eq)} \int E_i(q)\,n_i(q; T)\,dq \tag{15.6.7}$$

$$p_{eq}(T) \equiv \sum_{i(eq)} \int \frac{q^2}{3E_i(q)}\,n_i(q; T)\,dq \tag{15.6.8}$$

[see Eqs. (2.10.21) and (2.10.22).] According to the second law of thermodynamics,
the entropy of the particles in equilibrium at temperature $T$ within a volume $V$ is a
function $S(V, T)$ with

$$dS(V, T) = \frac{1}{T}\left\{d(\rho_{eq}(T)V) + p_{eq}(T)\,dV\right\} \tag{15.6.9}$$

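so that

$$\frac{\partial S(V, T)}{\partial V} = \frac{1}{T}\left\{\rho_{eq}(T) + p_{eq}(T)\right\} \qquad \frac{\partial S(V, T)}{\partial T} = \frac{V}{T}\frac{d\rho_{eq}(T)}{dT}$$

The energy density and pressure must then satisfy the integrability condition

$$\frac{\partial}{\partial T}\left[\frac{1}{T}\left\{\rho_{eq}(T) + p_{eq}(T)\right\}\right] = \frac{\partial}{\partial V}\left[\frac{V}{T}\frac{d\rho_{eq}(T)}{dT}\right]$$

or, after a little rearrangement,

$$\frac{dp_{eq}(T)}{dT} = \frac{1}{T}\left\{\rho_{eq}(T) + p_{eq}(T)\right\} \tag{15.6.10}$$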


[This may also be derived directly from Eqs. (15.6.7) and (15.6.8).] As long as the
particles in thermal equilibrium interact only with each other, their total energy
and pressure must separately satisfy the energy conservation equation (14.2.19):

$$R^3\frac{dp_{eq}(T)}{dt} = \frac{d}{dt}\left\{R^3[\rho_{eq}(T) + p_{eq}(T)]\right\} \tag{15.6.11}$$

Using (15.6.10), this may now be written

$$\frac{d}{dt}\left[\frac{R^3}{T}\left\{\rho_{eq}(T) + p_{eq}(T)\right\}\right] = 0 \tag{15.6.12}$$

This conservation law has a simple interpretation in terms of the entropy. Using
(15.6.10) in (15.6.9) gives

$$dS(V, T) = \frac{1}{T}\,d\left[\left\{\rho_{eq}(T) + p_{eq}(T)\right\}V\right] - \frac{V}{T^2}\left\{\rho_{eq}(T) + p_{eq}(T)\right\}dT$$

so, except for a possible additive constant,

$$S(V, T) = \frac{V}{T}\left\{\rho_{eq}(T) + p_{eq}(T)\right\} \tag{15.6.13}$$

The result (15.6.12) thus simply states the constancy of the entropy in a volume
$R^3(t)$:

$$s \equiv S(R^3, T) = \frac{R^3}{T}\left\{\rho_{eq}(T) + p_{eq}(T)\right\} \tag{15.6.14}$$
In particular, when all the particles in equilibrium are highly relativistic, we
can set $E = q$ in (15.6.7) and (15.6.8), so that

$$p_{eq}(T) = \tfrac{1}{3}\rho_{eq}(T) \tag{15.6.15}$$

Then (15.6.10) gives

$$\rho_{eq}(T) \propto T^4 \tag{15.6.16}$$

with a “constant” of proportionality that depends on just which particle types are
abundant in equilibrium at these temperatures. [This result can also be obtained
directly from (15.6.7) and (15.6.8).] Using (15.6.15) and (15.6.16) in (15.6.12) then
gives a temperature decrease

$$T \propto \frac{1}{R} \tag{15.6.17}$$

We shall see that this holds through most, but not all, of the early history of the
universe.

Our next task is to decide which particles were in thermal equilibrium at
various times. One of the simplifications brought about by our neglect of the
chemical potentials is that the only particles that can be present in thermal
equilibrium with appreciable number densities (15.6.3) are those with mass
$m < kT$. For $kT < m_\pi$, or $T < 1.5 \times 10^{12}$ °K, these are the $\mu^\pm$, $e^\pm$, $\nu_\mu$, $\bar\nu_\mu$, $\nu_e$, $\bar\nu_e$,
and $\gamma$. (Gravitons are ignored here, for reasons discussed in Section 15.11.)
Throughout the early history of the universe, the processes of pair production
and annihilation and Compton scattering kept any extant charged particles in
thermal equilibrium with the photons. Hence the photons were described by the
Planck law (15.5.9), and the $e^\pm$ and $\mu^\pm$ were described by a Fermi distribution
with zero chemical potential:


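$$n_{e^-}(q)\,dq = n_{e^+}(q)\,dq = 8\pi h^{-3} q^2\,dq \left[\exp\left(\frac{\sqrt{q^2 + m_e^2}}{kT}\right) + 1\right]^{-1} \tag{15.6.18}$$

$$n_{\mu^-}(q)\,dq = n_{\mu^+}(q)\,dq = 8\pi h^{-3} q^2\,dq \left[\exp\left(\frac{\sqrt{q^2 + m_\mu^2}}{kT}\right) + 1\right]^{-1} \tag{15.6.19}$$

What about the neutrinos and antineutrinos? We know that they can be
produced, destroyed, and scattered in reactions such as

$$\begin{aligned}
e^- + \mu^+ &\leftrightarrow \nu_e + \bar\nu_\mu & \qquad e^+ + \mu^- &\leftrightarrow \bar\nu_e + \nu_\mu \\
\nu_e + \mu^- &\leftrightarrow \nu_\mu + e^- & \qquad \bar\nu_e + \mu^+ &\leftrightarrow \bar\nu_\mu + e^+ \\
\nu_\mu + \mu^+ &\leftrightarrow \nu_e + e^+ & \qquad \bar\nu_\mu + \mu^- &\leftrightarrow \bar\nu_e + e^-
\end{aligned} \tag{15.6.20}$$

As long as $kT > m_\mu$ the cross-sections for all of these reactions will be roughly of
order

$$\sigma_{wk} \approx g_{wk}^2\,\hbar^{-4}(kT)^2 \tag{15.6.21}$$

where $g_{wk} \simeq 1.4 \times 10^{-49}$ erg-cm$^3$ is the weak coupling constant, known from the
observed rate for the muon decay process $\mu^+ \to e^+ + \nu_e + \bar\nu_\mu$. At these
temperatures, all particle velocities are of order unity, and (15.6.18) and (15.6.19)
give the densities of the charged leptons $e^\pm$ and $\mu^\pm$ as

$$n_l \approx \left(\frac{kT}{\hbar}\right)^3 \tag{15.6.22}$$

Hence the rate at which a single neutrino is scattered, and the rate of neutrino
production per charged lepton, are both of order

$$\sigma_{wk} n_l \approx g_{wk}^2\,\hbar^{-7}(kT)^5 \tag{15.6.23}$$

The total energy density is roughly of order

$$\rho \approx kT\left(\frac{kT}{\hbar}\right)^3 \tag{15.6.24}$$

so according to (15.6.2), the expansion rate is of order

$$H \equiv \frac{\dot R}{R} \approx (G\rho)^{1/2} \approx G^{1/2}\hbar^{-3/2}(kT)^2 \tag{15.6.25}$$

Hence, as long as $kT > m_\mu$, or $T > 10^{12}$ °K, the ratio of the reaction rate $\sigma_{wk} n_l$ to
the expansion rate $H$ is (now using c.g.s. units)

$$\frac{\sigma_{wk} n_l}{H} \approx G^{-1/2}\hbar^{-11/2} g_{wk}^2 (kT)^3 \approx \left(\frac{T}{10^{10}\,{}^\circ\mathrm{K}}\right)^3 \tag{15.6.26}$$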


However, all of the reactions (15.6.20) require either the presence of a $\mu^-$ or $\mu^+$,
or enough energy to make a $\mu^-$ or $\mu^+$. When $kT < m_\mu$, the number density of
muons, and the number densities of other particles with energy $E > m_\mu$, are
reduced by factors of order $\exp(-m_\mu/kT)$, and in consequence the ratio of the
reaction rate to the expansion rate is of order

$$\frac{\sigma_{wk} n_l}{H} \approx \left(\frac{T}{10^{10}\,{}^\circ\mathrm{K}}\right)^3 \exp\left(-\frac{10^{12}\,{}^\circ\mathrm{K}}{T}\right) \tag{15.6.27}$$

The neutrinos and antineutrinos drop out of thermal equilibrium with the other
particles when this ratio falls below unity, that is, at about $T \approx 1.3 \times 10^{11}$ °K.
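
The decoupling temperature can be checked by solving numerically for the temperature at which the suppressed ratio $(T/10^{10}\,{}^\circ\mathrm{K})^3 \exp(-10^{12}\,{}^\circ\mathrm{K}/T)$ equals unity. The Python sketch below does this by bisection; the bracketing interval and tolerance are arbitrary illustrative choices.

```python
import math

def rate_ratio(temperature_k):
    """Reaction rate over expansion rate below the muon threshold:
    (T / 1e10 K)^3 * exp(-1e12 K / T)."""
    return (temperature_k / 1.0e10) ** 3 * math.exp(-1.0e12 / temperature_k)

def decoupling_temperature(lo=5.0e10, hi=1.0e12, tol=1.0e6):
    """Temperature (deg K) at which rate_ratio crosses unity.

    rate_ratio increases monotonically with T, so bisection on the
    bracket [lo, hi] converges to the single crossing.
    """
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if rate_ratio(mid) < 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    print(decoupling_temperature())
```

Since the estimate itself is only good to order of magnitude, the precise root carries no more significance than the quoted $1.3 \times 10^{11}$ °K.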

Actually, it may be that the $\nu_e$ and $\bar\nu_e$ remain in thermal equilibrium a little
longer than the $\nu_\mu$ and $\bar\nu_\mu$. According to present theories,$^{91}$ the weak interactions
arise from the coupling of a “weak current” to itself, either directly, or through
the agency of a charged spin-1 particle, the “intermediate vector meson.” If this is
so, then there are additional reactions involving $\nu_e$ and $\bar\nu_e$,

$$e^- + e^+ \leftrightarrow \nu_e + \bar\nu_e \qquad e^\pm + \nu_e \to e^\pm + \nu_e \qquad e^\pm + \bar\nu_e \to e^\pm + \bar\nu_e \tag{15.6.28}$$

whose cross-sections are of order (15.6.21) for $kT > m_e$. These reactions do not
involve $\mu^\pm$, so the ratio of the $\nu_e$ and $\bar\nu_e$ reaction rates to the expansion rate $H$
would be given by Eq. (15.6.26) for $kT > m_e$, that is, down to a temperature
$T \simeq 5 \times 10^9$ °K. The reactions (15.6.28) could then keep $\nu_e$ and $\bar\nu_e$ in thermal
equilibrium with $\gamma$ and $e^\pm$ down to a temperature $T \simeq 10^{10}$ °K, where the ratio
(15.6.26) drops to unity. The same may even be true for $\nu_\mu$ and $\bar\nu_\mu$.

We are now in a position to work out the thermal history of the early universe.
Let us start at a temperature between $10^{12}$ °K and $1.3 \times 10^{11}$ °K, when the $\mu^+$
and $\mu^-$ were rare enough so that their contribution to $\rho_{eq}$ and $p_{eq}$ could be neglected,
and yet abundant enough to keep the neutrinos and antineutrinos in thermal
equilibrium with the other particles. The important constituents of the universe
then were $e^\pm$, $\gamma$, $\nu_e$, $\bar\nu_e$, $\nu_\mu$, and $\bar\nu_\mu$, all in thermal equilibrium. The photons had a
Planck distribution, the $e^\pm$ had the Fermi distribution (15.6.18), and the neutrinos
and antineutrinos had the Fermi distribution

$$n_{\nu_e}(q)\,dq = n_{\bar\nu_e}(q)\,dq = n_{\nu_\mu}(q)\,dq = n_{\bar\nu_\mu}(q)\,dq = 4\pi h^{-3} q^2\,dq \left[\exp\left(\frac{q}{kT}\right) + 1\right]^{-1} \tag{15.6.29}$$


Since all these particles were highly relativistic, the temperature was falling in
obedience to Eq. (15.6.17), that is, $T \propto R^{-1}$. When $T$ dropped to about $1.3 \times
10^{11}$ °K, the $\nu_\mu$ and $\bar\nu_\mu$, and possibly also the $\nu_e$ and $\bar\nu_e$, decoupled from the particles
in equilibrium and began a free expansion. However, this decoupling had no effect on
any of the distribution functions. The particles remaining in equilibrium still
constituted a highly relativistic gas, so their temperature continued to drop like
$1/R$. In addition, the number density of the free neutrinos and antineutrinos fell
like $1/R^3$ and their momenta were red-shifted by a factor $1/R$ (just as for photons),
so that the form of the distribution (15.6.29) was preserved, with a neutrino
temperature $T_\nu$ proportional to $1/R$. Since $T_\nu$ equaled $T$ before decoupling, and
$T_\nu$ and $T$ both decreased like $1/R$ thereafter, the neutrinos and antineutrinos
continued to be described by the Fermi distribution (15.6.29) with $T_\nu = T$, just
as if they had remained in thermal equilibrium with the other particles. There
may have been a second decoupling, of $\nu_e$ and $\bar\nu_e$ at $T \simeq 10^{10}$ °K, but again this
made no difference to the neutrino and antineutrino distribution function, provided
that the $\nu_e$ and $\bar\nu_e$ mostly decoupled while the $e^\pm$ were still relativistic. Thus, during
the whole of the era $10^{12}$ °K $> T > 5 \times 10^9$ °K, the neutrinos and antineutrinos
behaved as if they were in thermal equilibrium, and all particles, $\gamma$, $e^\pm$, $\nu_\mu$, $\bar\nu_\mu$, $\nu_e$,
and $\bar\nu_e$, were described by Planck or Fermi distributions with the same temperature
$T$, falling like $1/R$. The energy densities of the neutrinos and antineutrinos were
thus

$$\rho_{\nu_e} = \rho_{\bar\nu_e} = \rho_{\nu_\mu} = \rho_{\bar\nu_\mu} \equiv \rho_\nu \tag{15.6.30}$$

where

$$\rho_\nu = \frac{7}{8}\,\frac{\pi^2 (kT)^4}{30\hbar^3} = \frac{7}{16}\,aT^4 \tag{15.6.31}$$

Also, for $kT > m_e$ the $e^\pm$ were relativistic, so

$$\rho_{e^-} = \rho_{e^+} = 2\rho_\nu = \frac{7}{8}\,aT^4 \tag{15.6.32}$$

(The densities $\rho_{e^\pm}$ are twice $\rho_\nu$, because the $e^-$ and $e^+$ each have two spin states.)
The total energy density of the universe during the era from $T < 10^{12}$ °K to
$T \simeq 10^{10}$ °K was thus

$$\rho = \rho_{\nu_e} + \rho_{\bar\nu_e} + \rho_{\nu_\mu} + \rho_{\bar\nu_\mu} + \rho_{e^-} + \rho_{e^+} + \rho_\gamma = \frac{9}{2}\,aT^4 \tag{15.6.33}$$
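
The numerical factors 7/16, 7/8, and 9/2 rest on the ratio of the Fermi integral $\int_0^\infty x^3 (e^x + 1)^{-1} dx$ to the Bose integral $\int_0^\infty x^3 (e^x - 1)^{-1} dx$, which is 7/8. The Python sketch below verifies this by direct quadrature; the Simpson-rule helper, the upper cutoff of 60, and the step count are illustrative choices.

```python
import math

def simpson(f, a, b, n=20000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def bose_integrand(x):
    """x^3 / (e^x - 1), with the x -> 0 limit set to zero."""
    return x ** 3 / math.expm1(x) if x > 0.0 else 0.0

def fermi_integrand(x):
    """x^3 / (e^x + 1)."""
    return x ** 3 / (math.exp(x) + 1.0)

# Energy density of one fermion spin state relative to one boson spin state.
fermi_to_bose = simpson(fermi_integrand, 0.0, 60.0) / simpson(bose_integrand, 0.0, 60.0)

# In units of a*T^4 (photons, two spin states, contribute exactly 1):
rho_neutrino = 0.5 * fermi_to_bose    # one spin state per neutrino species
rho_electron = fermi_to_bose          # two spin states for e- (and for e+)
rho_total = 1.0 + 2.0 * rho_electron + 4.0 * rho_neutrino

if __name__ == "__main__":
    print(fermi_to_bose, rho_neutrino, rho_electron, rho_total)
```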

The story now becomes a little more complicated. Below $10^{10}$ °K, the only
important particles left in thermal equilibrium were the $e^\pm$ and $\gamma$. Their entropy
per volume $R^3$ is given by (15.6.14), (15.6.7), (15.6.8), and (15.6.18):

$$s = \frac{R^3}{T}\left\{\rho_{e^-} + \rho_{e^+} + \rho_\gamma + p_{e^-} + p_{e^+} + p_\gamma\right\} \tag{15.6.34}$$

For $T > 5 \times 10^9$ °K, the electrons and positrons were relativistic, so (15.6.15)
and (15.6.32) apply, and (15.6.34) gives

$$s = \frac{4R^3}{3T}\left\{\rho_{e^-} + \rho_{e^+} + \rho_\gamma\right\} = \frac{11}{3}\,a(RT)^3 \tag{15.6.35}$$

As $T$ dropped below $5 \times 10^9$ °K, the $e^+$ and $e^-$ annihilated, eventually leaving just
photons, with

$$s = \frac{4R^3}{3T}\,\rho_\gamma = \frac{4}{3}\,a(RT)^3 \tag{15.6.36}$$

But $s$ is constant, so the effect of the disappearance of $e^-$ and $e^+$ was to increase
$RT$ by a factor$^{92}$

$$\frac{(RT)_{T < 10^9\,{}^\circ\mathrm{K}}}{(RT)_{T > 5 \times 10^9\,{}^\circ\mathrm{K}}} = \left(\frac{11}{4}\right)^{1/3} \tag{15.6.37}$$

The neutrinos and antineutrinos did not get heated by the electron-positron
annihilation, so their temperature just continued to fall like $R^{-1}$. Hence, for
$T < 5 \times 10^9$ °K, we have to distinguish between the temperature $T_\nu$ of the
neutrinos and antineutrinos, and the temperature $T$ of the photons plus any
remaining charged particles. Since $RT_\nu$ is constant and $RT$ jumped by the factor
$(11/4)^{1/3}$, the photon temperature was eventually greater than the neutrino
temperature by just this factor:

$$\left(\frac{T}{T_\nu}\right)_{T < 10^9\,{}^\circ\mathrm{K}} = \left(\frac{11}{4}\right)^{1/3} = 1.401 \tag{15.6.38}$$

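In order to determine the behavior of $RT$ or $T/T_\nu$ between $5 \times 10^9$ °K and
$10^9$ °K, we have to use the expression (15.6.34), or

$$s = \frac{4a}{3}(RT)^3\,\mathcal{S}\left(\frac{m_e}{kT}\right) \tag{15.6.39}$$

where

$$\mathcal{S}(x) \equiv 1 + \frac{45}{2\pi^4} \int_0^\infty y^2\,dy \left[\sqrt{x^2 + y^2} + \frac{y^2}{3\sqrt{x^2 + y^2}}\right] \left[\exp\left(\sqrt{x^2 + y^2}\right) + 1\right]^{-1} \tag{15.6.40}$$

The constant $s$ can be expressed in terms of the constant $RT_\nu$ by replacing $T$ with
$T_\nu$ in (15.6.35), so that

$$T_\nu = \left(\frac{4}{11}\right)^{1/3} T \left[\mathcal{S}\left(\frac{m_e}{kT}\right)\right]^{1/3} \tag{15.6.41}$$

A numerical calculation$^{93}$ of the function $\mathcal{S}$ shows that $T/T_\nu$ had risen only to
1.081 by the time the temperature dropped to $3 \times 10^9$ °K, and $T/T_\nu$ did not reach
1.4 until $T$ fell below $10^9$ °K. (See Table 15.4.)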
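
The function $\mathcal{S}(x) = 1 + (45/2\pi^4)\int_0^\infty y^2\,dy\,[\sqrt{x^2+y^2} + y^2/(3\sqrt{x^2+y^2})]\,[\exp(\sqrt{x^2+y^2}) + 1]^{-1}$ is easy to evaluate by quadrature, and $T/T_\nu = [11/(4\,\mathcal{S}(m_e/kT))]^{1/3}$ then follows. The Python sketch below is one way to tabulate it; the value of $m_e/k$, the integration cutoff, and the step count are assumptions made for illustration, and it is not the calculation cited in the text.

```python
import math

M_E_OVER_K = 5.930e9   # electron rest energy over Boltzmann's constant, deg K (assumed)

def script_s(x, y_max=60.0, n=6000):
    """Entropy function S(x), evaluated with the composite Simpson rule.

    S(0) = 11/4 (relativistic e+ e- pairs plus photons) and S -> 1
    (photons only) as x -> infinity.
    """
    def integrand(y):
        e = math.sqrt(x * x + y * y)
        if e == 0.0:
            return 0.0
        return y * y * (e + y * y / (3.0 * e)) / (math.exp(e) + 1.0)

    h = y_max / n
    total = integrand(0.0) + integrand(y_max)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return 1.0 + 45.0 / (2.0 * math.pi ** 4) * total * h / 3.0

def photon_to_neutrino_temperature(temperature_k):
    """Ratio T / T_nu at photon temperature T (deg K)."""
    return (11.0 / (4.0 * script_s(M_E_OVER_K / temperature_k))) ** (1.0 / 3.0)

if __name__ == "__main__":
    for t in (1.0e11, 1.0e10, 3.0e9, 1.0e9, 3.0e8, 1.0e8):
        print(t, photon_to_neutrino_temperature(t))
```

The two limits provide a quick sanity check: the ratio tends to 1 at high temperature and to $(11/4)^{1/3}$ at low temperature.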

For $T < 10^9$ °K, the only particles in thermal equilibrium with the photons
were the small number of nucleons and electrons left over after all the $e^- e^+$ pairs
annihilated. Both $T_\nu$ and $T$ continued to fall like $1/R$, with a ratio fixed at the
value (15.6.37). We saw in the last section that the photon temperature $T_\gamma$ began
to differ from the matter temperature $T$ after $T$ dropped below 4000°K, but the
photon temperature continued thereafter to drop like $1/R$. Thus there should now
be a cosmic “black-body” neutrino and antineutrino background described by Eq.
(15.6.29), with temperature

$$T_{\nu 0} = \left(\frac{4}{11}\right)^{1/3} T_{\gamma 0} = 1.9^\circ\mathrm{K}$$

From the time when T cs 10 9o K until the present, the energy density of the 
photons, neutrinos, and antineutrinos has been 


PR = Py + Pv e + Pv e + Pv„ + Pv„ 

= aT y + laT 4 v 

= [1 + i(-A~)* /3 laTf = lASaTf 


(15.6.42) 
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This may be compared with the energy density m N n N of nonrelativistic matter, 
which scales as R~ 3 , or T y 3 : 



Hence the critical temperature T c , at which m N n N equaled p R , is 


T 


c 


m N n N 0 

1 A5aT 3 0 


4200 °K 



m N n N0 

_30 g/cm 


3 


(15.6.43) 


For $m_N n_{N0}$ in the range $2 \times 10^{-29}$ g/cm$^3$ to $3 \times 10^{-31}$ g/cm$^3$, this temperature
lies in the range 84,000 °K to 1200 °K. It may be noted that the temperature
$T_R \simeq 4000$ °K, at which ionized hydrogen recombined, lies within this range, so we
are not certain whether the energy density of radiation was greater or less than that
of matter at the time when matter and radiation lost thermal contact. This
uncertainty did not affect our discussion of the microwave background in the last
section; the important point there was that the number density of photons is and
was much greater than that of baryons.
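The numbers quoted in Eqs. (15.6.42) and (15.6.43) can be reproduced with a few lines of
arithmetic. The sketch below assumes $T_{\gamma 0} = 2.7$ °K (the value used in Table 15.4)
and modern rounded values of the radiation constant and the speed of light; the function
names are illustrative only.

```python
RADIATION_CONSTANT = 7.5657e-15   # a, in erg cm^-3 K^-4
SPEED_OF_LIGHT = 2.9979e10        # c, in cm/sec
T_GAMMA_0 = 2.7                   # assumed present photon temperature, in K


def neutrino_temperature(t_gamma=T_GAMMA_0):
    """Present neutrino temperature, T_nu0 = (4/11)^(1/3) T_gamma0."""
    return (4.0 / 11.0) ** (1.0 / 3.0) * t_gamma


def radiation_factor():
    """Coefficient of a T_gamma^4 in Eq. (15.6.42): 1 + (7/4)(4/11)^(4/3)."""
    return 1.0 + 1.75 * (4.0 / 11.0) ** (4.0 / 3.0)


def critical_temperature(matter_density_0, t_gamma_0=T_GAMMA_0):
    """T_c of Eq. (15.6.43); matter_density_0 = m_N n_N0 in g/cm^3."""
    a_mass = RADIATION_CONSTANT / SPEED_OF_LIGHT ** 2   # a / c^2 in g cm^-3 K^-4
    return matter_density_0 / (radiation_factor() * a_mass * t_gamma_0 ** 3)


if __name__ == "__main__":
    print(f"T_nu0 = {neutrino_temperature():.2f} K")
    print(f"rho_R / (a T_gamma^4) = {radiation_factor():.3f}")
    for rho in (2e-29, 1e-30, 3e-31):
        print(f"m_N n_N0 = {rho:.0e} g/cm^3   T_c = {critical_temperature(rho):.0f} K")
```

With these inputs the result for $10^{-30}$ g/cm$^3$ comes out slightly below the rounded
4200 °K of the text, which is within the precision of the quoted coefficient 1.45.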

How long does all this take? During the era when the temperature was between
about $10^{12}$ °K and $5 \times 10^{9}$ °K, and also after it dropped below about $10^{9}$ °K, the
only particles present in large numbers were all highly relativistic, so that $p \simeq \rho/3$.
According to (15.1.23), the energy density $\rho$ varied as

$$\rho \propto R^{-4}$$

During these periods, the dynamical equation (15.6.2) may be written

$$\frac{\dot\rho}{\rho} = -4\frac{\dot R}{R} = -4\left(\frac{8\pi\rho G}{3}\right)^{1/2}$$

The solution is

$$t = \left(\frac{3}{32\pi G\rho}\right)^{1/2} + \text{constant} \tag{15.6.44}$$


During the period when $10^{12}$ °K $> T > 5 \times 10^{9}$ °K, the energy density was
given by (15.6.33), so that (using c.g.s. units)

$$t = \left(\frac{1}{48\pi G a T^4}\right)^{1/2} + \text{constant} = 1.09\ \mathrm{sec}\left[\frac{T}{10^{10}\ {}^\circ\mathrm{K}}\right]^{-2} + \text{constant}$$

Starting at $T = 10^{12}$ °K, it took 0.0107 sec for the temperature to fall to $10^{11}$ °K,
and another 1.07 sec for the temperature to drop to $10^{10}$ °K.

During the period when $10^{9}$ °K $> T > T_c$, the energy density was given by
(15.6.42), so that

$$t = \left(\frac{1}{15.5\pi G a T^4}\right)^{1/2} + \text{constant} = 1.92\ \mathrm{sec}\left[\frac{T}{10^{10}\ {}^\circ\mathrm{K}}\right]^{-2} + \text{constant}$$

The time required for the temperature to drop from $10^{9}$ °K to $10^{8}$ °K was thus
about 5.3 hr. If radiation continued to dominate over matter until the hydrogen
recombined at $T = 4000$ °K, then the age of the universe at the time of recombination
was $4 \times 10^{5}$ years.
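The two time-temperature coefficients and the elapsed times quoted above follow from
Eq. (15.6.44) with $\rho = f\,aT^4$, where $f = 9/2$ in the early period and
$f = 1 + \tfrac{7}{4}(4/11)^{4/3}$ in the late period. The sketch below is only an
arithmetic check; it assumes modern rounded constants, with $a$ expressed as a mass
density ($a/c^2$), and the helper names are illustrative.

```python
import math

G_NEWTON = 6.674e-8     # gravitational constant, cm^3 g^-1 sec^-2
A_MASS = 8.418e-36      # radiation constant a / c^2, g cm^-3 K^-4
SECONDS_PER_YEAR = 3.156e7


def time_coefficient(energy_factor):
    """tau in t = tau (T / 1e10 K)^-2, for rho = energy_factor * a T^4 (Eq. 15.6.44)."""
    rho_at_1e10 = energy_factor * A_MASS * 1e40
    return math.sqrt(3.0 / (32.0 * math.pi * G_NEWTON * rho_at_1e10))


def elapsed(t_hot, t_cold, tau):
    """Time in seconds for the temperature to fall from t_hot to t_cold (kelvin)."""
    return tau * ((1e10 / t_cold) ** 2 - (1e10 / t_hot) ** 2)


TAU_EARLY = time_coefficient(4.5)
TAU_LATE = time_coefficient(1.0 + 1.75 * (4.0 / 11.0) ** (4.0 / 3.0))

if __name__ == "__main__":
    print(f"early coefficient: {TAU_EARLY:.3f} sec")
    print(f"late coefficient:  {TAU_LATE:.3f} sec")
    print(f"1e12 -> 1e11 K: {elapsed(1e12, 1e11, TAU_EARLY):.4f} sec")
    print(f"1e11 -> 1e10 K: {elapsed(1e11, 1e10, TAU_EARLY):.3f} sec")
    print(f"1e9 -> 1e8 K:   {elapsed(1e9, 1e8, TAU_LATE) / 3600.0:.2f} hr")
    print(f"age at 4000 K:  {TAU_LATE * (1e10 / 4000.0) ** 2 / SECONDS_PER_YEAR:.2e} yr")
```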

Unfortunately, if we want to describe the behavior of $T(t)$ and $R(t)$ throughout
the whole early history of the universe, we have to do a numerical calculation
to get through the era of electron-positron annihilation. In order to express $R$ in
terms of $T$, we use the fact that (15.6.39) was constant from $T < 10^{12}$ °K until the
present, provided that after $T$ dropped to 4000 °K, we replace $T$ with $T_\gamma$. Thus

$$s = \tfrac{4}{3}a(R_0 T_{\gamma 0})^3 \tag{15.6.45}$$

and (15.6.39) can therefore be written

$$\frac{R}{R_0} = \frac{T_{\gamma 0}}{T}\left[\mathcal{S}\!\left(\frac{m_e}{kT}\right)\right]^{-1/3} \tag{15.6.46}$$


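The energy density $\rho$ is a function of $T$, which, for $T$ less than $10^{12}$ °K and greater
than both $T_c$ and 4000 °K, may be written

$$\rho = \rho_\gamma + \rho_{\nu_e} + \rho_{\bar\nu_e} + \rho_{\nu_\mu} + \rho_{\bar\nu_\mu} + \rho_{e^-} + \rho_{e^+}$$

$$\phantom{\rho} = aT^4 + \tfrac{7}{4}aT_\nu^4 + 16\pi h^{-3}\int_0^\infty E_e(q)\,q^2\,dq\left[\exp\left(\frac{E_e(q)}{kT}\right) + 1\right]^{-1}$$

Using (15.6.41) for $T_\nu$, this is

$$\rho = aT^4\,\mathcal{E}\!\left(\frac{m_e}{kT}\right) \tag{15.6.47}$$

where

$$\mathcal{E}(x) = 1 + \frac{7}{4}\left(\frac{4}{11}\right)^{4/3}\mathcal{S}^{4/3}(x) + \frac{30}{\pi^4}\int_0^\infty \sqrt{x^2 + y^2}\;y^2\,dy\left[\exp\left(\sqrt{x^2 + y^2}\right) + 1\right]^{-1} \tag{15.6.48}$$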


Equations (15.6.46) and (15.6.47) can be used in the dynamical equation (15.6.2),

$$dt = \left(\frac{8\pi\rho G}{3}\right)^{-1/2}\frac{dR}{R}$$

and we find a formula for the time as a function of the temperature:

$$t = -\int\left(\frac{8\pi G a T^4\,\mathcal{E}(m_e/kT)}{3}\right)^{-1/2}\left[\frac{dT}{T} + \frac{d\mathcal{S}(m_e/kT)}{3\,\mathcal{S}(m_e/kT)}\right] \tag{15.6.49}$$

Results$^{93}$ for $t$, $R/R_0$, and $T/T_\nu$ as functions of $T$ are given in Table 15.4.
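The integral (15.6.49) is straightforward to evaluate numerically. The sketch below is
one possible way to do it, not the calculation of Ref. 93: it uses the variable
$u = \ln\left(T\,\mathcal{S}^{1/3}\right)$, which by Eq. (15.6.46) equals $-\ln R$ plus a constant, so
that $dt = -du/H$ with $H = (8\pi G\rho/3)^{1/2}$. The grid sizes, the cutoff at $y = 60$,
the rounded constants, and the function names are all assumptions of this illustration,
and the routine is intended only for $T \gtrsim 10^{8}$ °K.

```python
import math

G_NEWTON = 6.674e-8     # gravitational constant, cm^3 g^-1 sec^-2
A_MASS = 8.418e-36      # radiation constant a / c^2, g cm^-3 K^-4
ME_OVER_K = 5.930e9     # m_e c^2 / k in kelvin


def entropy_and_energy_factors(x, n=600, y_max=60.0):
    """Return (S(x), E(x)) of Eqs. (15.6.40) and (15.6.48) by Simpson's rule."""
    h = y_max / n
    s_sum = e_sum = 0.0
    for i in range(1, n + 1):                  # the i = 0 term vanishes
        y = i * h
        energy = math.sqrt(x * x + y * y)
        if energy > 700.0:
            continue                           # negligible, and avoids overflow
        weight = 1 if i == n else (4 if i % 2 else 2)
        occupied = weight * y * y / (math.exp(energy) + 1.0)
        s_sum += occupied * (energy + y * y / (3.0 * energy))
        e_sum += occupied * energy
    s = 1.0 + 45.0 / (2.0 * math.pi ** 4) * s_sum * h / 3.0
    e = (1.0 + 1.75 * (4.0 / 11.0) ** (4.0 / 3.0) * s ** (4.0 / 3.0)
         + 30.0 / math.pi ** 4 * e_sum * h / 3.0)
    return s, e


def time_since_1e12(t_final, steps=300):
    """Seconds elapsed between T = 1e12 K and T = t_final, from Eq. (15.6.49)."""
    ln_start, ln_end = math.log(1e12), math.log(t_final)
    elapsed = 0.0
    previous = None
    for j in range(steps + 1):
        temperature = math.exp(ln_start + (ln_end - ln_start) * j / steps)
        s, e = entropy_and_energy_factors(ME_OVER_K / temperature)
        u = math.log(temperature) + math.log(s) / 3.0     # -ln R + constant
        inverse_hubble = 1.0 / math.sqrt(
            8.0 * math.pi * G_NEWTON * A_MASS * temperature ** 4 * e / 3.0)
        if previous is not None:
            # trapezoidal rule for dt = -du / H (u decreases as T falls)
            elapsed += 0.5 * (inverse_hubble + previous[1]) * (previous[0] - u)
        previous = (u, inverse_hubble)
    return elapsed


if __name__ == "__main__":
    for T in (1e11, 1e10, 1e9):
        print(f"T = {T:.0e} K   t = {time_since_1e12(T):.4g} sec")
```

The values obtained this way can be compared with the $t$ column of Table 15.4.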

The only really arbitrary assumption so far has been the conjecture that the
lepton number densities $N_E$ and $N_M$ are zero, or at least much less than $n_\gamma$. Let us
now consider what would be the effect of giving up this assumption. Once the
temperature had dropped below $10^{12}$ °K, the only abundant charged particles were
the electrons and positrons, so charge neutrality required that $N_Q = n_{e^+} - n_{e^-}$
vanished. The chemical potential of the electron must then have vanished, so the
only particles that might then have had nonvanishing chemical potentials were

Table 15.4  Thermal History of the Universe, from the Annihilation of $\mu^+\mu^-$
Pairs Until the Decoupling of Matter and Radiation$^a$

T (°K)                R/R_0                    T/T_ν     t (sec)
$10^{12}$             $1.9 \times 10^{-12}$    1.000     0
$6 \times 10^{11}$    $3.2 \times 10^{-12}$    1.000     $1.94 \times 10^{-4}$
$3 \times 10^{11}$    $6.4 \times 10^{-12}$    1.000     $1.129 \times 10^{-3}$
$2 \times 10^{11}$    $9.6 \times 10^{-12}$    1.000     $2.61 \times 10^{-3}$
$10^{11}$             $1.9 \times 10^{-11}$    1.000     $1.078 \times 10^{-2}$
$6 \times 10^{10}$    $3.2 \times 10^{-11}$    1.000     $3.01 \times 10^{-2}$
$3 \times 10^{10}$    $6.4 \times 10^{-11}$    1.001     0.1209
$2 \times 10^{10}$    $9.6 \times 10^{-11}$    1.002     0.273
$10^{10}$             $1.9 \times 10^{-10}$    1.008     1.103
$6 \times 10^{9}$     $3.1 \times 10^{-10}$    1.022     3.14
$3 \times 10^{9}$     $5.9 \times 10^{-10}$    1.081     13.83
$2 \times 10^{9}$     $8.3 \times 10^{-10}$    1.159     35.2
$10^{9}$              $2.6 \times 10^{-9}$     1.346     $1.82 \times 10^{2}$
$3 \times 10^{8}$     $9.0 \times 10^{-9}$     1.401     $2.08 \times 10^{3}$
$10^{8}$              $2.7 \times 10^{-8}$     1.401     $1.92 \times 10^{4}$
$10^{7}$              $2.7 \times 10^{-7}$     1.401     $1.92 \times 10^{6}$
$10^{6}$              $2.7 \times 10^{-6}$     1.401     $1.92 \times 10^{8}$
$10^{5}$              $2.7 \times 10^{-5}$     1.401     $1.92 \times 10^{10}$
$10^{4}$              $2.7 \times 10^{-4}$     1.401     $1.92 \times 10^{12}$
$4 \times 10^{3}$     $6.3 \times 10^{-4}$     1.401     $1.20 \times 10^{13}$


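$^a$ The values for $R/R_0$ are derived assuming a present radiation temperature $T_{\gamma 0} = 2.7$ °K. The
last few values for $t$ are derived under the assumption that the energy density of matter was still
negligible compared with that of photons and neutrinos. Values of $T/T_\nu$ and $t$ for $T > 10^{8}$ °K are
taken from P. J. E. Peebles, Ap. J., 146, 542 (1966).

the neutrinos and antineutrinos. For $T > 1.3 \times 10^{11}$ °K, these particles were in
equilibrium with $\gamma$, $e^+$, and $e^-$, so they were described by the Fermi distributions

$$n_{\nu_e}(q)\,dq = 4\pi h^{-3} q^2\,dq\left[\exp\left(\frac{q - \mu_{\nu_e}}{kT}\right) + 1\right]^{-1} \tag{15.6.50}$$

$$n_{\bar\nu_e}(q)\,dq = 4\pi h^{-3} q^2\,dq\left[\exp\left(\frac{q + \mu_{\nu_e}}{kT}\right) + 1\right]^{-1} \tag{15.6.51}$$

and likewise for $\nu_\mu$ and $\bar\nu_\mu$. The lepton number densities were then

$$N_E = \int_0^\infty\left[n_{\nu_e}(q) - n_{\bar\nu_e}(q)\right]dq = 4\pi\left(\frac{kT}{h}\right)^3\mathcal{N}\!\left(\frac{\mu_{\nu_e}}{kT}\right) \tag{15.6.52}$$

$$N_M = \int_0^\infty\left[n_{\nu_\mu}(q) - n_{\bar\nu_\mu}(q)\right]dq = 4\pi\left(\frac{kT}{h}\right)^3\mathcal{N}\!\left(\frac{\mu_{\nu_\mu}}{kT}\right) \tag{15.6.53}$$

where

$$\mathcal{N}(x) \equiv \int_0^\infty\left\{[e^{y-x} + 1]^{-1} - [e^{y+x} + 1]^{-1}\right\}y^2\,dy \tag{15.6.54}$$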


Since electron-lepton number and muon-lepton number are believed to be
conserved,$^{89}$ the densities $N_E$ and $N_M$ must always vary as $R^{-3}$. However, we
have seen that during the era when $10^{12}$ °K $> T > 5 \times 10^{9}$ °K, $T$ varies as $1/R$.
Hence $\mu_{\nu_e}/kT$ and $\mu_{\nu_\mu}/kT$ must have been constant, from the annihilation of the
$\mu^+$ and $\mu^-$ until the decoupling of the neutrinos and antineutrinos.

After decoupling, the neutrinos and antineutrinos have expanded freely, with
number densities dropping as $1/R^3$ and momenta red-shifted by a factor $1/R$. This
free expansion preserved the form of the distributions (15.6.50) and (15.6.51) but
red-shifted the temperature and the chemical potentials by a factor $1/R$. Hence
the neutrino distributions, during the whole period from $T < 10^{12}$ °K until the
present, are given by

$$n_{\nu_e}(q)\,dq = 4\pi h^{-3} q^2\,dq\left[\exp\left(\frac{q - \mu_{\nu_e}}{kT_\nu}\right) + 1\right]^{-1} \tag{15.6.55}$$

$$n_{\bar\nu_e}(q)\,dq = 4\pi h^{-3} q^2\,dq\left[\exp\left(\frac{q + \mu_{\nu_e}}{kT_\nu}\right) + 1\right]^{-1} \tag{15.6.56}$$

$$n_{\nu_\mu}(q)\,dq = 4\pi h^{-3} q^2\,dq\left[\exp\left(\frac{q - \mu_{\nu_\mu}}{kT_\nu}\right) + 1\right]^{-1} \tag{15.6.57}$$

$$n_{\bar\nu_\mu}(q)\,dq = 4\pi h^{-3} q^2\,dq\left[\exp\left(\frac{q + \mu_{\nu_\mu}}{kT_\nu}\right) + 1\right]^{-1} \tag{15.6.58}$$

where $T_\nu$, $\mu_{\nu_e}$, and $\mu_{\nu_\mu}$ all vary as $1/R$, with $T_\nu = T$ before the electrons and
positrons annihilate. The $e^+$-$e^-$ annihilation was not affected by the neutrino and
antineutrino distributions, so all the previous results for $T_\nu$ and $R$ as functions of
$T$ still apply.

If $N_E$ and $N_M$ are much less than the photon number density $n_\gamma \simeq (kT/h)^3$,
then (15.6.52) and (15.6.53) give

$$|\mu_{\nu_e}| \ll kT_\nu \qquad |\mu_{\nu_\mu}| \ll kT_\nu \tag{15.6.59}$$

and the distributions (15.6.55)-(15.6.58) all reduce to the previously used
distribution (15.6.29).

On the other hand, if $N_E$ or $N_M$ is comparable with or greater than $n_\gamma$, then
the constants $|\mu_{\nu_e}/kT_\nu|$ or $|\mu_{\nu_\mu}/kT_\nu|$ will be of order unity or larger, and the
distribution functions (15.6.55)-(15.6.58) will be appreciably different from
(15.6.29). In the limit when, say, $\mu_{\nu_e}/kT_\nu \gg 1$, the distribution functions (15.6.55)
and (15.6.56) become

$$n_{\nu_e}(q)\,dq \to \begin{cases} 4\pi h^{-3} q^2\,dq & q < \mu_{\nu_e} \\ 0 & q > \mu_{\nu_e} \end{cases} \tag{15.6.60}$$

$$n_{\bar\nu_e}(q)\,dq \to 0 \tag{15.6.61}$$

This is the case of complete neutrino degeneracy. Of course, if $\mu_{\nu_e}/kT_\nu \ll -1$, then
the role of the neutrinos and antineutrinos is reversed in (15.6.60) and (15.6.61),
and we have complete antineutrino degeneracy. The possibility of complete
neutrino degeneracy was suggested$^{94}$ several years before the discovery of the
microwave background, when it seemed reasonable to suppose that the universe
has always been cold enough so that $kT_\nu \ll |\mu_{\nu_e}|$.

The only effect that partial or complete degeneracy would have on the
calculations of this section is that it would shorten the time scale. The total energy
density of neutrinos and antineutrinos is given by

$$\rho_\nu = \int_0^\infty\left[n_{\nu_e}(q) + n_{\bar\nu_e}(q) + n_{\nu_\mu}(q) + n_{\bar\nu_\mu}(q)\right]q\,dq = 4\pi h^{-3}(kT_\nu)^4\left[\mathcal{F}\!\left(\frac{\mu_{\nu_e}}{kT_\nu}\right) + \mathcal{F}\!\left(\frac{\mu_{\nu_\mu}}{kT_\nu}\right)\right] \tag{15.6.62}$$

where

$$\mathcal{F}(x) \equiv \int_0^\infty\left\{[e^{y-x} + 1]^{-1} + [e^{y+x} + 1]^{-1}\right\}y^3\,dy$$

This is always greater than the energy density $(7/4)aT^4$ for zero chemical potential
[see (15.6.30) and (15.6.31)], so the expansion rate (15.6.2) is increased by
degeneracy. In the limit where $|\mu_{\nu_e}/kT_\nu| \gg 1$ or $|\mu_{\nu_\mu}/kT_\nu| \gg 1$ or both, we have

$$\rho \simeq \rho_\nu \to \pi h^{-3}\left(\mu_{\nu_e}^4 + \mu_{\nu_\mu}^4\right) \tag{15.6.63}$$

The degenerate neutrinos or antineutrinos then dominate the energy density and
the expansion rate.

It is interesting to ask whether we could detect a cosmic background of
neutrinos and antineutrinos. The most stringent upper limit on $|\mu_{\nu_e}|$ and $|\mu_{\nu_\mu}|$
comes from measurements of the deacceleration parameter $q_0$. Since $q_0$ is not much
larger than unity, the total energy density cannot be much greater than about
$10^{-29}$ g/cm$^3$ (see Section 15.2), and therefore, according to (15.6.63),

$$\left[\mu_{\nu_e 0}^4 + \mu_{\nu_\mu 0}^4\right]^{1/4} \lesssim 0.0075\ \mathrm{eV} \tag{15.6.64}$$

As we have seen, the present neutrino temperature $T_{\nu 0}$ is about 1.9 °K, so
$kT_{\nu 0} = 1.7 \times 10^{-4}$ eV. Thus the upper limit on the chemical potential may be written

$$\frac{|\mu_{\nu_e 0}|}{kT_{\nu 0}} \lesssim 45 \qquad \frac{|\mu_{\nu_\mu 0}|}{kT_{\nu 0}} \lesssim 45 \tag{15.6.65}$$

Measurements of $q_0$ therefore do not rule out nearly complete degeneracy.
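The limits (15.6.64) and (15.6.65) can be checked by inverting Eq. (15.6.63) in c.g.s.
units, where the degenerate energy density reads $\rho c^2 = \pi\mu^4/(hc)^3$. The sketch
below assumes rounded modern constants; the function name and the choice of eV as the
working unit are illustrative.

```python
import math

HC_EV_CM = 1.2398e-4      # h c in eV cm
ERG_PER_EV = 1.602e-12    # erg per eV
SPEED_OF_LIGHT = 2.998e10 # cm/sec
K_BOLTZMANN_EV = 8.617e-5 # eV per kelvin


def chemical_potential_bound(max_mass_density):
    """Largest (mu_e^4 + mu_mu^4)^(1/4) in eV allowed by Eq. (15.6.63).

    max_mass_density is the assumed upper limit on the total density in g/cm^3.
    """
    energy_density = max_mass_density * SPEED_OF_LIGHT ** 2 / ERG_PER_EV   # eV/cm^3
    return (energy_density * HC_EV_CM ** 3 / math.pi) ** 0.25


if __name__ == "__main__":
    bound = chemical_potential_bound(1e-29)
    kt_nu0 = K_BOLTZMANN_EV * 1.9
    print(f"bound on mu: {bound:.4f} eV")
    print(f"kT_nu0:      {kt_nu0:.2e} eV")
    print(f"mu / kT_nu0: {bound / kt_nu0:.0f}")
```

Because the bound scales only as the fourth root of the density limit, it is insensitive
to the precise value adopted for the maximum density.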

We can also try to measure the chemical potentials directly. In allowed $\beta^-$
decays, such as $\mathrm{H}^3 \to \mathrm{He}^3 + e^- + \bar\nu_e$, we normally expect the number of events
for an electron energy between $E_e$ and $E_e + dE_e$ to be given by the Fermi function

$$N_F(E_e)\,dE_e = a\,p_e E_e (W_0 - E_e)^2 F(E_e)\,dE_e$$

where $a$ is a constant, $p_e$ is the electron momentum, $W_0$ is the maximum electron
energy, and $F(E_e)$ is a known function that corrects for the Coulomb interaction
in the final state. However, in the presence of an antineutrino background (15.6.56),
the Pauli exclusion principle reduces the $\beta^-$ decay rate by a factor equal to the
fraction of antineutrino states at energy $W_0 - E_e$ that are unfilled:

$$N(E_e)\,dE_e = \left[1 - \frac{h^3 n_{\bar\nu_e}(W_0 - E_e)}{4\pi(W_0 - E_e)^2}\right]N_F(E_e)\,dE_e$$

or, explicitly,

$$N(E_e)\,dE_e = \left[1 + \exp\left(\frac{E_e - W_0 - \mu_{\nu_e 0}}{kT_{\nu 0}}\right)\right]^{-1} a\,p_e E_e (W_0 - E_e)^2 F(E_e)\,dE_e \tag{15.6.66}$$

Since $W_0$ is much greater than $|\mu_{\nu_e 0}|$ and $kT_{\nu 0}$ for all known beta decays, this
correction has little effect over most of the electron spectrum. However, if
$\mu_{\nu_e 0} < -kT_{\nu 0}$, the function $N(E_e)$ will show an anomalous depression over the
range $W_0 > E_e > W_0 - |\mu_{\nu_e 0}|$, very much as if the antineutrino had a mass
$|\mu_{\nu_e 0}|$. If $\mu_{\nu_e 0} > kT_{\nu 0}$, there will not be much of a depression for $E_e$ below $W_0$, but
there will be events with $E_e > W_0$, caused by absorption of cosmic neutrinos in
reactions such as $\nu_e + \mathrm{H}^3 \to e^- + \mathrm{He}^3$. The rate for these events is given$^{94}$ by
the same formula (15.6.66) as for antineutrino emission, except that now $\mu_{\nu_e 0}$ is
replaced with $-\mu_{\nu_e 0}$, and of course $E_e > W_0$. Thus, for $\mu_{\nu_e 0} > kT_{\nu 0}$, the $\beta^-$
spectrum will rise beyond the endpoint $W_0$ up to an energy $W_0 + \mu_{\nu_e 0}$, giving the
appearance of a violation of the conservation of energy.

By far the best data on the electron spectrum near the endpoint in $\beta^-$ decay
come from studies of the low-energy decay $\mathrm{H}^3 \to \mathrm{He}^3 + e^- + \bar\nu_e$, with endpoint
$W_0 = 18.7$ keV. In a recent experiment,$^{95}$ there were found no anomalous
depressions extending more than about 60 eV below the endpoint, and no anomalous
events more than about 60 eV above the endpoint. We can conclude that

$$|\mu_{\nu_e 0}| \lesssim 60\ \mathrm{eV} \tag{15.6.67}$$

for a chemical potential of either sign.

It is also possible to get indirect information about the cosmic neutrino and
antineutrino background from the survival of cosmic ray protons. A neutrino or
antineutrino with energy $q$ that is struck at an angle $\theta$ by a relativistic proton of
energy $\gamma m_p$ will appear in the proton rest-frame to have energy

$$E \simeq \gamma q(1 - \cos\theta) \qquad \text{for } \gamma \gg 1$$

The total cross-section for $p\nu$ or $p\bar\nu$ reactions at a “laboratory” energy $E$ is roughly

$$\sigma(E) \simeq AE^2$$

where, in c.g.s. units,

$$A \approx \frac{g_{wk}^2}{\hbar^4 c^4} \approx 10^{-56}\ \mathrm{cm^2/eV^2}$$

The reaction rate for a relativistic proton of energy $\gamma m_p$ in the degenerate $\nu_e$ (or
$\bar\nu_e$) background (15.6.60) is then

$$\Gamma = \int_0^{|\mu_{\nu_e 0}|}\!\!\int \sigma\big(\gamma q[1 - \cos\theta]\big)\,h^{-3} q^2\,dq\,\sin\theta\,d\theta\,d\phi$$

or, in c.g.s. units,

$$\Gamma = \frac{16\pi\gamma^2 A\,|\mu_{\nu_e 0}|^5}{15\,h^3 c^2} = 3 \times 10^{-34}\,\gamma^2\,|\mu_{\nu_e 0}(\mathrm{eV})|^5\ \mathrm{sec}^{-1} \tag{15.6.68}$$

and similarly for degenerate $\nu_\mu$'s or $\bar\nu_\mu$'s. Bernstein, Ruderman, and Feinberg$^{96}$
have remarked that, since the cosmic ray protons observed with $\gamma > 10^{6}$ have
certainly been traveling for more than $10^{6}$ sec, both $|\mu_{\nu_e 0}|$ and $|\mu_{\nu_\mu 0}|$ must be less
than $10^{3}$ eV. Cowsik, Pal, and Tandon$^{97}$ assume that protons with $\gamma \approx 10^{9}$
could not scatter more than about 14 times during a flight time of order $5 \times 10^{7}$
years, and conclude that $|\mu_{\nu_e 0}|$ and $|\mu_{\nu_\mu 0}|$ are both less than about 2 eV.
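The order of magnitude of these two limits can be recovered from the closed form of the
rate, $\Gamma = 16\pi\gamma^2 A|\mu|^5/(15 h^3 c^2)$, by requiring that the expected number of
scatterings during the flight time not exceed a chosen maximum. The sketch below takes
$A = 10^{-56}$ cm$^2$/eV$^2$ at face value, so its coefficient is only a rough estimate;
the function names and the use of eV for $\mu$ are assumptions of this illustration.

```python
import math

CROSS_SECTION_A = 1e-56     # assumed value of A in cm^2 / eV^2
HC_EV_CM = 1.2398e-4        # h c in eV cm
SPEED_OF_LIGHT = 2.998e10   # cm/sec
SECONDS_PER_YEAR = 3.156e7


def scattering_rate(gamma, mu_ev):
    """Gamma = 16 pi gamma^2 A |mu|^5 / (15 h^3 c^2), in sec^-1, with mu in eV."""
    return (16.0 * math.pi * gamma ** 2 * CROSS_SECTION_A * abs(mu_ev) ** 5
            * SPEED_OF_LIGHT / (15.0 * HC_EV_CM ** 3))


def chemical_potential_limit(gamma, flight_time_sec, max_scatterings=1.0):
    """Largest |mu| in eV for which rate * flight_time stays below max_scatterings."""
    rate_at_1_ev = scattering_rate(gamma, 1.0)
    return (max_scatterings / (rate_at_1_ev * flight_time_sec)) ** 0.2


if __name__ == "__main__":
    print(f"rate coefficient: {scattering_rate(1.0, 1.0):.1e} sec^-1")
    print(f"gamma = 1e6, 1e6 sec:        |mu| < "
          f"{chemical_potential_limit(1e6, 1e6):.0f} eV")
    print(f"gamma = 1e9, 5e7 yr, 14 hits: |mu| < "
          f"{chemical_potential_limit(1e9, 5e7 * SECONDS_PER_YEAR, 14.0):.1f} eV")
```

Since the limit depends on the fifth root of the rate, even an order-of-magnitude
uncertainty in $A$ changes the bound on $|\mu|$ by less than a factor of two.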

We can also look for kinks in the cosmic ray proton spectrum at the thresholds
for various $\nu p$ or $\bar\nu p$ reactions. For instance, the threshold for the reaction
$p + \bar\nu_e \to n + e^+$ is at $m_e + m_n - m_p = 1.8$ MeV, so if $\mu_{\nu_e 0}$ is less than $-kT_{\nu 0}$, there
should be a downward kink in the cosmic ray proton spectrum at

$$\gamma \simeq \frac{1.8\ \mathrm{MeV}}{|\mu_{\nu_e 0}|} \tag{15.6.69}$$

Konstantinov, Kocharov, and Starbunov$^{98}$ note the existence of a kink at
$\gamma \approx 2 \times 10^{6}$, and suggest that this may be due to a degenerate antineutrino
background with

$$\mu_{\nu_e 0} \simeq -0.8\ \mathrm{eV} \tag{15.6.70}$$

This estimate is very much larger in absolute value than the upper limit (15.6.64)
allowed by measurements of $q_0$.


7 Helium Synthesis 

The relative abundance of the chemical elements has been under careful
study by geologists and astronomers since the pioneering work of Frank Wigglesworth
Clarke$^{99}$ in the last century. Gradually these studies have revealed a
“cosmic” distribution of abundances,$^{100}$ with hydrogen and then helium by far
the most abundant elements, followed by the group C-N-O-Ne, and with the
group Li-Be-B and all elements heavier than nickel scarce. The problem of
explaining these abundances has long been considered one of the major challenges
facing theoretical astrophysics.

One possible explanation lay in the nuclear reactions that provide energy to
the stars. Rutherford’s demonstration of nuclear transmutation in the laboratory
led Eddington$^{101}$ in 1920 to suggest that the sun might derive its energy from the
fusion of hydrogen into helium. If so, then perhaps the stars (or at least the first
generation of stars) were formed from pure hydrogen, and have gradually produced
helium and heavier elements as ashes of their internal fires. The detailed reactions
by which the stars burn hydrogen to helium were laid out in 1939 by Hans
Bethe,$^{102}$ and the subsequent reactions in which helium fuses into heavier elements
were explored in the 1950’s in a series of papers by Salpeter,$^{103}$ the Burbidges,
Fowler, and Hoyle,$^{104}$ Cameron,$^{105}$ and others. Most recently, Clayton
and Arnett$^{106}$ have emphasized the importance of stellar explosions as an agent
in nucleosynthesis.

There is one other competing theory of nucleosynthesis, worked out in the
late 1940’s by George Gamow and his collaborators.$^{107}$ Gamow reasoned that,
although the early hot dense period of cosmic expansion was much briefer than the
lifetime of a star, there was a large number of free neutrons present at that time,
so that the heavy elements could be built up quickly by successive neutron
captures, starting with $n + p \to d + \gamma$. The abundances of the elements would
then be correlated with their neutron capture cross-sections, in rough agreement
with observation. We have already noted in Section 15.5 that the necessity of
avoiding too much helium production in this theory required the presence of
black-body radiation, with a present temperature that was estimated$^{52}$ as 5 °K.

Both the stellar and the cosmological theories of nucleosynthesis have their
limitations. There are no stable nuclei with atomic weights $A = 5$ or $A = 8$, so
it is difficult to build up elements heavier than helium by $p$-$\alpha$, $n$-$\alpha$, or $\alpha$-$\alpha$
collisions. In stars that have converted all hydrogen to helium at their cores, it is
possible to bridge the gaps at $A = 5$ and $A = 8$ by the production of small
amounts of the unstable nuclide Be$^8$ in $\alpha$-$\alpha$ collisions, followed by the production
of C$^{12}$ in $\alpha$-Be$^8$ collisions.$^{103}$ However, the density of the expanding universe at
the temperature $T \approx 10^{9}$ °K is too low to allow much helium burning to occur. It
is generally accepted today that all elements heavier than helium were synthesized
in stars.

On the other hand, several authors$^{108}$ have noted that the cosmic abundance
of helium is too large to be easily explained in terms of nucleosynthesis in stars.
The luminosity-to-mass ratio $L/M$ of our galaxy is about one-tenth the solar ratio
$L_\odot/M_\odot$, or 0.2 erg/gm sec. If the luminosity of the galaxy has remained constant
during the last $10^{10}$ years, then about 0.06 MeV per nucleon would have been
produced. In contrast, the fusion of hydrogen into helium releases about 6 MeV
per nucleon, so not more than about 1% of the nucleons in our galaxy could have
been fused into helium (or heavier nuclei) by ordinary stellar processes. As we shall
see, estimates of the present helium abundance vary, but there is wide agreement
that the cosmic abundance of helium by mass is considerably greater than 1%.
It is of course possible that the helium could have been synthesized in an earlier,
more luminous, epoch of our galaxy; as already remarked in Section 15.5, the
released energy could, if thermalized, account for the present 2.7 °K microwave
background. However, it is more interesting and more natural to assume that the
large cosmic helium abundance was produced during the early history of the
universe, with the energy of fusion mostly lost in the subsequent red shift.
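The energy budget in this argument is simple arithmetic, sketched below. The only
inputs are the figures quoted in the text ($L/M = 0.2$ erg/gm sec, $10^{10}$ years, and
6 MeV per nucleon released by fusion) together with rounded unit conversions; the
function name is illustrative.

```python
ERG_PER_MEV = 1.602e-6
NUCLEON_MASS_G = 1.673e-24
SECONDS_PER_YEAR = 3.156e7
FUSION_MEV_PER_NUCLEON = 6.0    # energy released by hydrogen-to-helium fusion


def radiated_energy_per_nucleon(luminosity_to_mass, years):
    """Energy radiated per nucleon, in MeV, at constant L/M (erg per gm per sec)."""
    energy_per_gram = luminosity_to_mass * years * SECONDS_PER_YEAR   # erg/gm
    return energy_per_gram * NUCLEON_MASS_G / ERG_PER_MEV


if __name__ == "__main__":
    energy = radiated_energy_per_nucleon(0.2, 1e10)
    print(f"radiated energy: {energy:.3f} MeV per nucleon")
    print(f"fraction of nucleons fused: {energy / FUSION_MEV_PER_NUCLEON:.1%}")
```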

Let us now calculate the cosmologically produced abundance of helium. It is
very convenient to divide this calculation into two parts. First, we calculate the
neutron-proton abundance ratio as a function of time, taking account only of the
weak interaction processes

$$n + \nu \leftrightarrow p + e^- \qquad n + e^+ \leftrightarrow p + \bar\nu \qquad n \leftrightarrow p + e^- + \bar\nu \tag{15.7.1}$$

(Here $\nu$ will mean $\nu_e$.) In the second part of this calculation, we put in the nuclear
reactions that lead to helium synthesis.

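The number densities of $\nu$, $\bar\nu$, $e^-$, and $e^+$ are given here by the Fermi
distributions (15.6.3), with zero chemical potential and with different temperatures $T$ or
$T_\nu$ for $e^\pm$ (and $\gamma$) or $\nu$ and $\bar\nu$:

$$n_{e^-}(p)\,dp = n_{e^+}(p)\,dp = 8\pi h^{-3} p^2\,dp\left[\exp\left(\frac{E_e(p)}{kT}\right) + 1\right]^{-1}$$

$$n_\nu(p)\,dp = n_{\bar\nu}(p)\,dp = 4\pi h^{-3} p^2\,dp\left[\exp\left(\frac{E_\nu(p)}{kT_\nu}\right) + 1\right]^{-1}$$

where

$$E_e(p) = (p^2 + m_e^2)^{1/2} \qquad E_\nu(p) = p$$

The rates for the various reactions (15.7.1) are given by the “$V$-$A$” theory of weak
interactions,$^{109}$ except that the Pauli exclusion principle suppresses these rates by a
factor equal to the fraction of all states that are unfilled:

$$1 - \left[\exp\left(\frac{E_e}{kT}\right) + 1\right]^{-1} = \left[1 + \exp\left(\frac{-E_e}{kT}\right)\right]^{-1}$$

$$1 - \left[\exp\left(\frac{E_\nu}{kT_\nu}\right) + 1\right]^{-1} = \left[1 + \exp\left(\frac{-E_\nu}{kT_\nu}\right)\right]^{-1}$$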

The rates (per nucleon) of the processes (15.7.1) are then

$$\lambda(n + \nu \to p + e^-) = A\int v_e E_e^2\,p_\nu^2\,dp_\nu\,[e^{E_\nu/kT_\nu} + 1]^{-1}[1 + e^{-E_e/kT}]^{-1} \tag{15.7.2}$$

$$\lambda(n + e^+ \to p + \bar\nu) = A\int E_\nu^2\,p_e^2\,dp_e\,[e^{E_e/kT} + 1]^{-1}[1 + e^{-E_\nu/kT_\nu}]^{-1} \tag{15.7.3}$$

$$\lambda(n \to p + e^- + \bar\nu) = A\int v_e E_\nu^2 E_e^2\,dp_\nu\,[1 + e^{-E_\nu/kT_\nu}]^{-1}[1 + e^{-E_e/kT}]^{-1} \tag{15.7.4}$$

$$\lambda(p + e^- \to n + \nu) = A\int E_\nu^2\,p_e^2\,dp_e\,[e^{E_e/kT} + 1]^{-1}[1 + e^{-E_\nu/kT_\nu}]^{-1} \tag{15.7.5}$$

$$\lambda(p + \bar\nu \to n + e^+) = A\int v_e E_e^2\,p_\nu^2\,dp_\nu\,[e^{E_\nu/kT_\nu} + 1]^{-1}[1 + e^{-E_e/kT}]^{-1} \tag{15.7.6}$$

$$\lambda(p + e^- + \bar\nu \to n) = A\int v_e E_e^2 E_\nu^2\,dp_\nu\,[e^{E_e/kT} + 1]^{-1}[e^{E_\nu/kT_\nu} + 1]^{-1} \tag{15.7.7}$$

Here $A$ is the constant

$$A = \frac{g_V^2 + 3g_A^2}{2\pi^3\hbar^7} \tag{15.7.8}$$

where $g_V$ and $g_A$ are the vector and axial vector coupling constants of the nucleon,
taken here to have the values

$$g_V = 1.418 \times 10^{-49}\ \mathrm{erg\ cm^3} \qquad g_A = \ldots \tag{15.7.9}$$

Also, $E_e$ and $E_\nu$ are related by

$$E_e - E_\nu = Q \qquad \text{for } n + \nu \leftrightarrow p + e^- \tag{15.7.10}$$

$$E_\nu - E_e = Q \qquad \text{for } n + e^+ \leftrightarrow p + \bar\nu \tag{15.7.11}$$

$$E_\nu + E_e = Q \qquad \text{for } n \leftrightarrow p + e^- + \bar\nu \tag{15.7.12}$$

where

$$Q \equiv m_n - m_p = 1.293\ \mathrm{MeV} \tag{15.7.13}$$

The integrals (15.7.2)-(15.7.7) are taken over the positive values of $p_\nu$ and $p_e$
allowed by these relations. It is very convenient to write all integrals in terms of a
variable of integration $q$, taken as $E_\nu$ in Eqs. (15.7.2) and (15.7.5), and
as $-E_\nu$ in Eqs. (15.7.3), (15.7.4), (15.7.6), and (15.7.7), so that the electron energy is
always $|Q + q|$. Replacing $p_e^2\,dp_e$ with $v_e E_e^2\,dE_e$,
the total $n \to p$ and $p \to n$ transition rates are then

$$\lambda(n \to p) = \lambda(n + \nu \to p + e^-) + \lambda(n + e^+ \to p + \bar\nu) + \lambda(n \to p + e^- + \bar\nu)$$

$$\phantom{\lambda(n \to p)} = A\int\left(1 - \frac{m_e^2}{(Q + q)^2}\right)^{1/2}(Q + q)^2 q^2\,dq\,(1 + e^{q/kT_\nu})^{-1}(1 + e^{-(Q+q)/kT})^{-1} \tag{15.7.14}$$

$$\lambda(p \to n) = \lambda(p + e^- \to n + \nu) + \lambda(p + \bar\nu \to n + e^+) + \lambda(p + e^- + \bar\nu \to n)$$

$$\phantom{\lambda(p \to n)} = A\int\left(1 - \frac{m_e^2}{(Q + q)^2}\right)^{1/2}(Q + q)^2 q^2\,dq\,(1 + e^{-q/kT_\nu})^{-1}(1 + e^{(Q+q)/kT})^{-1} \tag{15.7.15}$$

The integrals now run from $-\infty$ to $+\infty$, leaving out a gap from $-Q - m_e$ to
$-Q + m_e$. The differential equation for the ratio $X_n$ of neutrons to all nucleons is

$$-\frac{dX_n}{dt} = \lambda(n \to p)X_n - \lambda(p \to n)(1 - X_n) \tag{15.7.16}$$
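The integrals (15.7.14) and (15.7.15) are easy to evaluate numerically. The sketch
below is an illustration only: it fixes the constant $A$ from the high-temperature
normalization $\tfrac{7}{15}\pi^4 A(kT)^5 = 0.361$ sec$^{-1}$ at $10^{10}$ °K rather than from
the coupling constants, uses a simple midpoint rule with an assumed cutoff at $60\,kT$,
and works in MeV. The function names and grid size are hypothetical choices.

```python
import math

Q_MEV = 1.293           # neutron-proton mass difference, Eq. (15.7.13)
ME_MEV = 0.511          # electron mass
K_MEV = 8.617e-11       # Boltzmann constant, MeV per kelvin
# Assumed normalization: (7/15) pi^4 A (kT)^5 = 0.361 sec^-1 at T = 1e10 K.
A_RATE = 0.361 * 15.0 / (7.0 * math.pi ** 4 * (K_MEV * 1e10) ** 5)   # sec^-1 MeV^-5


def occupation(z):
    """1 / (1 + e^z), guarded against overflow."""
    if z > 700.0:
        return 0.0
    return 1.0 / (1.0 + math.exp(z))


def transition_rate(direction, temperature, neutrino_temperature=None, n=4000):
    """lambda(n->p), Eq. (15.7.14), or lambda(p->n), Eq. (15.7.15), in sec^-1."""
    kt = K_MEV * temperature
    kt_nu = K_MEV * (temperature if neutrino_temperature is None
                     else neutrino_temperature)
    sign = 1.0 if direction == "n_to_p" else -1.0

    def integrand(q):
        e = Q_MEV + q                                  # signed electron energy
        velocity = math.sqrt(1.0 - (ME_MEV / e) ** 2)
        return (velocity * e * e * q * q
                * occupation(sign * q / kt_nu) * occupation(-sign * e / kt))

    def midpoint(lo, hi):
        h = (hi - lo) / n
        return h * sum(integrand(lo + (i + 0.5) * h) for i in range(n))

    cutoff = 60.0 * max(kt, kt_nu)     # assumed cutoff; Fermi factors are negligible beyond
    below_gap = midpoint(-Q_MEV - ME_MEV - cutoff, -Q_MEV - ME_MEV)
    above_gap = midpoint(-Q_MEV + ME_MEV, cutoff)
    return A_RATE * (below_gap + above_gap)


if __name__ == "__main__":
    for T in (1e12, 1e11, 1e10, 3e9):
        n_to_p = transition_rate("n_to_p", T)
        p_to_n = transition_rate("p_to_n", T)
        print(f"T = {T:.0e} K   n->p: {n_to_p:.3e}   p->n: {p_to_n:.3e} sec^-1")
```

With $T_\nu = T$ the two integrands differ pointwise by the factor $e^{-Q/kT}$, so the
computed ratio of the rates reproduces that factor to rounding accuracy.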


The solution of this equation has been calculated by Peebles,$^{93}$ and is presented
here in Table 15.5. Although the quantitative behavior of $X_n(t)$ can only be
obtained by a numerical integration, it is possible to appreciate the main features
of this solution through a few qualitative observations:

(A) For $kT \gg Q$, we can set $T = T_\nu$ and put $Q$ and $m_e$ equal to zero in Eqs.
(15.7.14) and (15.7.15). The transition rates are then

$$\lambda(n \to p) \simeq \lambda(p \to n) \simeq A\int_{-\infty}^{\infty} q^4\,dq\,(1 + e^{q/kT})^{-1}(1 + e^{-q/kT})^{-1} = \frac{7}{15}\pi^4 A(kT)^5 = 0.361\ \mathrm{sec}^{-1}\left(\frac{T}{10^{10}\ {}^\circ\mathrm{K}}\right)^5 \tag{15.7.17}$$

Table 15.5  Neutron Fraction $X_n$ as a Function of Temperature or
Time, with Neglect of the Formation of Complex Nuclei$^a$

T (°K)                  t (sec)      X_n
$10^{12}$               0            0.496
$3 \times 10^{11}$      0.001129     0.488
$10^{11}$               0.01078      0.462
$3 \times 10^{10}$      0.1209       0.380
$10^{10}$               1.103        0.241
$3 \times 10^{9}$       13.83        0.170
$1.3 \times 10^{9}$     98*          0.150
$1.2 \times 10^{9}$     119*         0.147
$1.1 \times 10^{9}$     146*         0.143
$1.0 \times 10^{9}$     182.0        0.137
$0.9 \times 10^{9}$     226*         0.131
$0.8 \times 10^{9}$     290*         0.123
$0.7 \times 10^{9}$     383*         0.112
$3 \times 10^{8}$       2080         0.021
$10^{8}$                18700        $10^{-8}$

$^a$ Values of $t$ are taken from the calculations of P. J. E. Peebles, Ap. J., 146,
542 (1966), except that values marked with an asterisk are interpolated from Peebles’
results. Values of $X_n$ for $T \ge 1.0 \times 10^{9}$ °K are taken from Peebles, op. cit., Table 4.
Values of $X_n$ for $T < 10^{9}$ °K are calculated from the value at $10^{9}$ °K, under the
assumption that $X_n$ decreases exponentially at the rate (1013 sec)$^{-1}$ of free neutron decay.


This may be compared with the “age” $t$, given by Eq. (15.6.44) and (15.6.33):

$$t = 1.09\ \mathrm{sec}\left(\frac{T}{10^{10}\ {}^\circ\mathrm{K}}\right)^{-2} \tag{15.7.18}$$

We see that the product $\lambda t$ is larger than 10 for $T > 3 \times 10^{10}$ °K, so at these
temperatures the neutron fraction $X_n$ should be given by the equilibrium solution
of Eq. (15.7.16), which is

$$X_n = \frac{\lambda(p \to n)}{\lambda(p \to n) + \lambda(n \to p)} \tag{15.7.19}$$

Note that Eq. (15.7.17) will not be quantitatively correct when $T$ drops to near
$3 \times 10^{10}$ °K, because $kT$ is then not very much larger than $Q$. However, even
though the rates $\lambda(p \to n)$ and $\lambda(n \to p)$ may differ somewhat from Eq. (15.7.17)
and from each other, they are still large enough for $T > 3 \times 10^{10}$ °K to justify
the use of the equilibrium solution (15.7.19).

(B) As long as $T_\nu \simeq T$ (that is, for $T > 10^{10}$ °K) the rates (15.7.14) and
(15.7.15) have the ratio

$$\frac{\lambda(p \to n)}{\lambda(n \to p)} = \exp\left(-\frac{Q}{kT}\right) \tag{15.7.20}$$

Thus Eq. (15.7.19) gives the neutron abundance for $T > 3 \times 10^{10}$ °K as

$$X_n \simeq \left[1 + e^{Q/kT}\right]^{-1} \tag{15.7.21}$$

The neutron abundance starts at $X_n \simeq \tfrac12$ at very early times, and drops slowly as
the temperature falls, reaching $X_n \simeq 0.38$ for $T = 3 \times 10^{10}$ °K. It is a profoundly
important fact that the initial condition for Eq. (15.7.16) does not have
to be chosen arbitrarily, and does not depend on any detailed model of the very
early universe, but follows directly from the singular behavior of the rates $\lambda$ as
$t \to 0$.$^{109a}$

(C) Once $T$ drops to about $1.3 \times 10^{9}$ °K, the rates of the two- and three-body
reactions $n + \nu \leftrightarrow p + e^-$, $n + e^+ \leftrightarrow p + \bar\nu$, and $p + e^- + \bar\nu \to n$ become
negligible. The only remaining reaction is the “one-body” process
$n \to p + e^- + \bar\nu$, which at these low temperatures proceeds at the rate of free neutron
decay, taken here to have the value used by Peebles$^{93}$:

$$\lambda^{-1}(n \to p + e^- + \bar\nu) = 1013\ \mathrm{sec} \tag{15.7.22}$$

Thus the neutron abundance from the time when $T \simeq 1.3 \times 10^{9}$ °K to the start
of nucleosynthesis is given by

$$X_n(t) = N\exp\left(-\frac{t(\mathrm{sec})}{1013}\right) \tag{15.7.23}$$


The only part of the theory of helium synthesis in which detailed numerical
calculations are really needed is the evaluation of the constant $N$. In carrying out
this calculation, it is convenient first to ignore neutron decay as well as
nucleosynthesis, in which case the neutron abundance is a function $X_n^{(0)}(t)$ that approaches
a finite limit as $t \to \infty$. (This is the quantity called $X_n$ in Peebles’ Table 1.$^{93}$ Peebles
also ignores the process $p + e^- + \bar\nu \to n$, which is in fact negligible over the whole
period of interest.) Neutron decay has a negligible effect until $t \approx 20$ sec, whereas
after that time the temperature is below $3 \times 10^{9}$ °K, so that the rate $\lambda(p \to n)$ is
negligible compared with $\lambda(n \to p)$, and lepton degeneracy has little effect on the
rate of neutron decay. It follows that the whole effect of neutron decay is to
multiply $X_n^{(0)}(t)$ with an exponential decay factor:

$$X_n(t) \simeq X_n^{(0)}(t)\exp\left(-\frac{t(\mathrm{sec})}{1013}\right) \tag{15.7.24}$$

Peebles$^{93}$ finds that $X_n^{(0)}(t)$ approaches the value 0.1640 as $t \to \infty$, so comparing
(15.7.23) with (15.7.24), we have

$$N \simeq X_n^{(0)}(\infty) = 0.1640 \tag{15.7.25}$$
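With $N = 0.1640$, Eq. (15.7.23) reproduces the late-time entries of Table 15.5
directly. The short sketch below evaluates it at a few of the tabulated times; the
function name is illustrative, and the times are simply those listed in the table.

```python
import math

NEUTRON_LIFETIME_SEC = 1013.0    # value adopted in Eq. (15.7.22)
N_CONSTANT = 0.1640              # Eq. (15.7.25)


def neutron_fraction(t_sec):
    """X_n(t) = N exp(-t / 1013 sec), Eq. (15.7.23)."""
    return N_CONSTANT * math.exp(-t_sec / NEUTRON_LIFETIME_SEC)


if __name__ == "__main__":
    for t in (182.0, 226.0, 290.0, 383.0, 2080.0):
        print(f"t = {t:6.0f} sec   X_n = {neutron_fraction(t):.3f}")
```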


Now we can proceed to the second part of our calculation, and put in the
nuclear reactions that lead to the synthesis of complex nuclei. At early times,
when $T \gg 10^{10}$ °K, the various nuclei would be in thermal equilibrium, with the
number density $n_i$ of nuclide $i$ given by Eq. (15.6.3). Since the nuclei are highly
nonrelativistic and nondegenerate during the whole period that concerns us here,
we can use the Maxwell-Boltzmann approximation to Eq. (15.6.3), and write for
the total number density of species $i$:

$$n_i = g_i\left(\frac{2\pi m_i kT}{h^2}\right)^{3/2}\exp\left(\frac{\mu_i - m_i}{kT}\right) \tag{15.7.26}$$

where $g_i$ is the statistical weight of nuclide $i$. Of course, we are not given the
chemical potentials $\mu_i$, but we know that they are
conserved in all reactions. Hence, if nuclear reactions can rapidly build up a
nucleus $i$ out of $Z_i$ protons and $A_i - Z_i$ neutrons, then $\mu_i$ is given by

$$\mu_i = Z_i\mu_p + (A_i - Z_i)\mu_n \tag{15.7.27}$$

It is convenient to write (15.7.26) as a relation between the fractions by weight of
nuclide $i$, free neutrons, and free protons:

$$X_i \equiv \frac{n_i A_i}{n_N} \qquad X_n \equiv \frac{n_n}{n_N} \qquad X_p \equiv \frac{n_p}{n_N}$$

where $n_N$ is the total number density of nucleons, bound or free:

$$n_N = n_{N0}\left(\frac{R_0}{R}\right)^3 = \frac{\rho_{N0}}{m_N}\left(\frac{R_0}{R}\right)^3$$

Using (15.7.27), and approximating $m_p = m_n = m_N$ and $m_i = A_i m_N$ in the
3/2-power in (15.7.26), we have then

$$X_i = \frac{g_i}{2}\,X_p^{Z_i} X_n^{A_i - Z_i} A_i^{5/2}\,\epsilon^{A_i - 1}\exp\left(\frac{B_i}{kT}\right) \tag{15.7.28}$$

where $B_i$ is the binding energy

$$B_i \equiv Z_i m_p + (A_i - Z_i)m_n - m_i \tag{15.7.29}$$

and $\epsilon$ is the dimensionless quantity

$$\epsilon \equiv \tfrac12 h^3 n_N(2\pi m_N kT)^{-3/2} = 1.61 \times 10^{-12}\left(\frac{\rho_{N0}}{10^{-30}\ \mathrm{g/cm^3}}\right)\left(\frac{R}{10^{-10}R_0}\right)^{-3}\left(\frac{T}{10^{10}\ {}^\circ\mathrm{K}}\right)^{-3/2} \tag{15.7.30}$$

Since $\epsilon$ is very small in the period of interest, the abundance of a given complex
nuclide $i$ will be very small until $T$ drops to the value

$$T_i \simeq \frac{B_i}{k(A_i - 1)\,|\ln\epsilon|} \tag{15.7.31}$$

Values of $T_i$ are given for various nuclides and various values of the present
density $\rho_{N0}$ in Table 15.6. Note that $T_i$ depends only very weakly on the present
density $\rho_{N0}$, because $\rho_{N0}$ enters only in the quantity $|\ln\epsilon|$, and this quantity has
a value between 25 and 35 over the whole range of relevant temperatures and
densities.
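Because $\epsilon$ itself depends on $T$, Eq. (15.7.31) is an implicit equation for $T_i$. The
sketch below solves it by fixed-point iteration for deuterium, using
$B/k(A - 1) = 25.8 \times 10^{9}$ °K from Table 15.6. It assumes $R/R_0 = T_{\gamma 0}/T$ with
$T_{\gamma 0} = 2.7$ °K, which holds only after the $e^+e^-$ pairs have annihilated; since
$\epsilon$ enters logarithmically, this approximation has a small effect. The constants are
rounded modern values and the function names are illustrative.

```python
import math

H_PLANCK = 6.626e-27     # erg sec
K_BOLTZMANN = 1.3807e-16 # erg per kelvin
NUCLEON_MASS = 1.6726e-24  # g
T_GAMMA_0 = 2.7          # assumed present photon temperature, K


def epsilon(temperature, rho_n0, r_over_r0):
    """epsilon = (1/2) h^3 n_N (2 pi m_N k T)^(-3/2), Eq. (15.7.30)."""
    n_nucleons = rho_n0 / NUCLEON_MASS / r_over_r0 ** 3
    return (0.5 * H_PLANCK ** 3 * n_nucleons
            * (2.0 * math.pi * NUCLEON_MASS * K_BOLTZMANN * temperature) ** -1.5)


def onset_temperature(b_over_k_a_minus_1, rho_n0, iterations=50):
    """Solve T = B / (k (A - 1) |ln epsilon(T)|), Eq. (15.7.31), by iteration."""
    temperature = 1e9                                   # starting guess
    for _ in range(iterations):
        eps = epsilon(temperature, rho_n0, T_GAMMA_0 / temperature)
        temperature = b_over_k_a_minus_1 / abs(math.log(eps))
    return temperature


if __name__ == "__main__":
    print(f"epsilon at 1e10 K, R/R0 = 1e-10: {epsilon(1e10, 1e-30, 1e-10):.2e}")
    for rho in (1e-29, 1e-30, 1e-31):
        t_d = onset_temperature(25.8e9, rho)
        print(f"rho_N0 = {rho:.0e} g/cm^3   T_d = {t_d / 1e9:.2f} x 1e9 K")
```

The iteration converges quickly because the right-hand side varies only
logarithmically with $T$.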


Table 15.6  Values for the Temperature $T_i$ Defined by Eq. (15.7.31), for Various
Nuclides and Various Values of the Present Density $\rho_{N0}$$^a$

                                     T_i ($10^{9}$ °K)
Nuclide       B/k(A - 1)       ρ_N0 = $10^{-29}$    ρ_N0 = $10^{-30}$    ρ_N0 = $10^{-31}$
              ($10^{9}$ °K)    g/cm$^3$             g/cm$^3$             g/cm$^3$
H$^2$         25.8             0.83                 0.77                 0.72
H$^3$         49.3             1.6                  1.5                  1.4
He$^3$        44.6             1.4                  1.3                  1.2
He$^4$, etc.  109              3.9                  3.6                  3.3

$^a$ Heavier nuclides have about the same values of $T_i$ as He$^4$. In thermal equilibrium, $T_i$ is the
maximum temperature at which a nuclide $i$ could be abundant.


If nuclear abundances really were governed by the conditions of thermal
equilibrium down to temperatures of order $10^{9}$ °K, then according to Table 15.6,
we should expect He$^4$ and heavier nuclides to appear first, followed by He$^3$ and
H$^3$, and finally by H$^2$. However, this is not at all what happens, because thermal
equilibrium is not maintained down to $10^{9}$ °K. The number densities at all but the
very earliest times are too low to allow nuclei to be built up directly in many-body
collisions like $2n + 2p \to \mathrm{He}^4$. Complex nuclei must instead be built up in
sequences of two-body reactions, such as

$$p + n \leftrightarrow d + \gamma$$
$$d + d \leftrightarrow \mathrm{He}^3 + n \leftrightarrow \mathrm{H}^3 + p$$
$$\mathrm{H}^3 + d \leftrightarrow \mathrm{He}^4 + n \tag{15.7.32}$$

etc.


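There is no problem with the first step; the rate of deuterium production per free
neutron is

$$\lambda_d = [4.55 \times 10^{-20}\ \mathrm{cm^3/sec}]\,n_p \simeq 27.4\ \mathrm{sec}^{-1}\left(\frac{R}{10^{-9}R_0}\right)^{-3}\left(\frac{\rho_{N0}}{10^{-30}\ \mathrm{g/cm^3}}\right) \tag{15.7.33}$$

and this is so much faster than the expansion rate $1/t$ [see Eq. (15.7.18)] that
deuterons will appear with the equilibrium abundance (15.7.28), which for a deuteron
(statistical weight 3, $A = 2$) reads

$$X_d = 6\sqrt{2}\,X_p X_n\,\epsilon\exp\left(\frac{B_d}{kT}\right) \tag{15.7.34}$$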


However, no appreciable quantity of H$^3$, He$^3$, He$^4$, or heavier nuclei can be formed
until this equilibrium deuterium abundance is high enough to allow $d$-$d$, $d$-$p$, or
$d$-$n$ reactions to proceed at an adequate rate. According to Table 15.6, the
equilibrium deuterium abundance (15.7.34) is very small for $T$ greater than about
$0.8 \times 10^{9}$ °K. The low binding energy of deuterium thus serves as a “bottleneck,”
which delays the formation of complex nuclei until $T$ drops to near $0.8 \times 10^{9}$ °K,
or a little earlier in models with a relatively high baryon number density.

Once nucleosynthesis begins, it proceeds very rapidly, because, according to
Table 15.6, any temperature less than $1.2 \times 10^{9}$ °K is low enough to permit
high equilibrium concentrations of nuclei heavier than deuterons. However, it is
not in fact possible to produce appreciable quantities of elements heavier than
helium because, as mentioned above, the lack of stable nuclides with $A = 5$ or
$A = 8$ impedes nucleosynthesis via $n$-$\alpha$, $p$-$\alpha$, or $\alpha$-$\alpha$ collisions, whereas the
Coulomb barrier in the reactions $\mathrm{He}^4 + \mathrm{H}^3 \to \mathrm{Li}^7 + \gamma$ and $\mathrm{He}^4 + \mathrm{He}^3 \to \mathrm{Be}^7 + \gamma$
prevents them from competing effectively with $p + \mathrm{H}^3 \to \mathrm{He}^4 + \gamma$
or $n + \mathrm{He}^3 \to \mathrm{He}^4 + \gamma$. Thus the effect of the nuclear reactions (15.7.32) is very
rapidly to incorporate all available neutrons into He$^4$ nuclei, which have by far
the highest binding energies of all nuclei with $A < 5$.

The nucleosynthesis process can only be followed in detail by a numerical
integration of a large number of rate equations. This has been done by Peebles$^{93}$
for the reactions (15.7.32), and by Wagoner, Fowler, and Hoyle$^{110}$ for the
reactions (15.7.32) plus the radiative processes

$$
\begin{aligned}
p + d &\leftrightarrow \mathrm{He}^3 + \gamma \qquad n + d \leftrightarrow \mathrm{H}^3 + \gamma \qquad p + \mathrm{H}^3 \leftrightarrow \mathrm{He}^4 + \gamma \\
n + \mathrm{He}^3 &\leftrightarrow \mathrm{He}^4 + \gamma \qquad d + d \leftrightarrow \mathrm{He}^4 + \gamma
\end{aligned}
\qquad (15.7.35)
$$

plus a large number of other processes leading (weakly) up to nuclei as heavy as
Mg$^{24}$. Fortunately, none of these complications are relevant to our basic problem,
that of understanding the helium abundance. All processes proceeding by strong
and electromagnetic interactions, such as the reactions (15.7.32) and (15.7.35),
will conserve the total numbers of protons and neutrons. The only effect of
nucleosynthesis on the neutron-proton ratio is that, by “turning off” the decay of free
neutrons, it freezes this ratio at the value it had just before the onset of
nucleosynthesis. Before nucleosynthesis begins, the ratio of neutrons to all nucleons is
simply the quantity $X_n$ given by Eq. (15.7.23). After nucleosynthesis is over, we
have essentially nothing left but free protons and He$^4$ nuclei, so the fraction of
neutrons to all nucleons is one-half the fraction of all nucleons that are bound in
helium, or one-half the abundance by weight of helium. Thus the abundance by
weight of cosmologically produced helium is simply given by

$$Y \equiv X_{\mathrm{He}^4}\ \text{(after nucleosynthesis)} = 2X_n\ \text{(just before nucleosynthesis)} \qquad (15.7.36)$$

According to the detailed calculations of Peebles, nucleosynthesis begins abruptly
at a temperature $0.9 \times 10^{9}$ °K for $\rho_{N0} = 7 \times 10^{-31}$ g/cm$^3$ or at $1.1 \times 10^{9}$ °K
for $\rho_{N0} = 1.8 \times 10^{-29}$ g/cm$^3$, just about as we should expect from our qualitative
considerations. Using Eq. (15.7.36), we can read off from Eq. (15.7.23) or Table
15.5 that for these two values of the present density, the helium abundance by
weight should be 26.2% or 28.6%, respectively. (Peebles$^{93}$ actually gives 25.8%
and 28.2% in these two cases. This very slight discrepancy is simply due to the
small number of free neutrons that decay during the brief duration of
nucleosynthesis.) It is safe to say that in the class of cosmological models considered
here, a helium abundance by weight of about 27% would be produced
cosmologically for any reasonable value of the present density. The reason the helium
abundance is so insensitive to the baryon number density is that the
neutron-proton ratio before nucleosynthesis is determined by the interaction of the
nucleons with the huge number of leptons, not with each other, while the onset of
nucleosynthesis is essentially determined by the temperature, not the nucleon
density.
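
As a small numerical illustration of Eq. (15.7.36), the following sketch converts between the neutron fraction $X_n$ just before nucleosynthesis and the helium abundance by weight $Y$. The function names are illustrative only, and the neutron fractions used in the example are simply one-half of the abundances quoted above (26.2% and 28.6%); they are not read from Table 15.5.

```python
def helium_mass_fraction(x_n: float) -> float:
    """Helium abundance by weight from Eq. (15.7.36): Y = 2 * X_n.

    x_n is the ratio of neutrons to all nucleons just before
    nucleosynthesis; it cannot exceed 1/2 if every neutron is to be
    paired with a proton in He4.
    """
    if not 0.0 <= x_n <= 0.5:
        raise ValueError("X_n must lie between 0 and 1/2")
    return 2.0 * x_n


def neutron_fraction(y: float) -> float:
    """Inverse relation: X_n = Y / 2."""
    if not 0.0 <= y <= 1.0:
        raise ValueError("Y must lie between 0 and 1")
    return 0.5 * y


if __name__ == "__main__":
    # Abundances quoted in the text for the two present densities.
    for y in (0.262, 0.286):
        x_n = neutron_fraction(y)
        print(f"Y = {y:.3f}  <->  X_n = {x_n:.3f}")
```

The factor of two reflects only the fact that each He$^4$ nucleus holds two neutrons and two protons; nothing about the reaction network enters.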

Wagoner, Fowler, and Hoyle$^{110}$ have calculated the cosmologically produced
abundances, not only of the isotopes of hydrogen and helium, but also of Li$^7$ and
heavier elements. Their results are given in Table 15.7. Note that the abundances
of all nuclides except H$^1$ and He$^4$ are extremely small, so that production or
destruction of these nuclides in stars could have a serious effect on their observed
“cosmic” abundances. For this reason it is primarily the cosmic abundance of
He$^4$ that serves as a check on models of the early universe. (However, Geiss and
Reeves$^{110a}$ argue that the H$^2$ and He$^3$ observed in the solar system do in fact
arise from the early universe. If this is correct, then the cosmic density must be
rather low, with a present value of order $3 \times 10^{-31}$ g/cm$^3$, to prevent the nuclear
reactions, which build up H$^2$ and He$^3$ into He$^4$, from proceeding to completion.)



Table 15.7 Cosmologically Produced Abundances (by Weight) of Various Nuclides, for Various Values of the Present
Density

There are a number of different methods by which the helium abundance can
be measured in different parts of the universe.

(A) Stellar Masses and Luminosity. The theory$^{111}$ of stellar structure and
evolution allows us in principle (and even in practice) to calculate a star’s
luminosity $L$ as a function of time if we are given its mass $M$ and initial chemical
composition. The chemical composition is usually specified by three numbers, $X$,
$Y$, $Z$, defined as the fraction by weight respectively of H$^1$, of He$^4$, and of
everything else, with

$$X + Y + Z = 1$$

(The heavy-element abundance $Z$, though usually small, is an important parameter
for any star in radiative equilibrium, such as the sun, because it determines the
opacity of the star at a given density and temperature. The helium abundance $Y$
is important because it governs the mean molecular weight appearing in the
ideal-gas law.) If we can guess $Z$ and the age of a given star, then comparison of theory
with measured values of $M$ and $L$ allows us to compute $Y$.

The best-studied star is, of course, the sun. Its mass and luminosity are known
quite accurately, and its age is believed to be close to the age of the earth, or about
$4.5 \times 10^{9}$ years. From the absorption lines of hydrogen and heavy elements it has
been estimated$^{112}$ that $Z/X$ is about 0.026 to 0.027 in the solar photosphere,
though a more recent study$^{113}$ gives $Z/X \simeq 0.019$. (Unfortunately, helium lines
are much too weak for $Y/X$ to be measured in the sun by this method.) Usually
solar evolution calculations are carried out for values of $Z$ in the range 0.01 to
0.04. At the time of the discovery of the cosmic microwave radiation the best
solar models$^{114}$ gave an initial helium abundance $Y = 0.27$ for $Z = 0.02$ (or
$Y = 0.32$ for $Z \simeq 0.04$), so it was regarded as a great victory for the “big-bang”
cosmology that, with $T_{\gamma 0} \simeq 3$ °K, it gives a primordial helium abundance $Y \simeq 0.27$.

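Unfortunately, this happy state lasted only until the advent of neutrino
astronomy. The same solar models that are used to calculate $Y$ also can be used
to predict the flux of neutrinos from various nuclear reactions in the sun. The sun
derives its energy from the fusion of hydrogen into helium in a proton-proton
cycle, starting with the reactions

$$
\begin{aligned}
\mathrm{H}^1 + \mathrm{H}^1 &\to \mathrm{H}^2 + e^+ + \nu \qquad (E_\nu = 0.263\ \mathrm{MeV}) \\
\mathrm{H}^1 + \mathrm{H}^1 + e^- &\to \mathrm{H}^2 + \nu \qquad (E_\nu = 1.4\ \mathrm{MeV}) \\
\mathrm{H}^2 + \mathrm{H}^1 &\to \mathrm{He}^3 + \gamma
\end{aligned}
$$

The cycle can then terminate with the “PP I” branch

$$\mathrm{He}^3 + \mathrm{He}^3 \to \mathrm{He}^4 + 2\mathrm{H}^1$$

or it can produce Be$^7$ by the reaction

$$\mathrm{He}^3 + \mathrm{He}^4 \to \mathrm{Be}^7 + \gamma$$

In the latter case, one Be$^7$ nucleus and one proton are converted into two He$^4$
nuclei by either the “PP II” branch

$$
\begin{aligned}
\mathrm{Be}^7 + e^- &\to \mathrm{Li}^7 + \nu \qquad (E_\nu = 0.80\ \mathrm{MeV}) \\
\mathrm{Li}^7 + \mathrm{H}^1 &\to \mathrm{He}^4 + \mathrm{He}^4
\end{aligned}
$$

or the “PP III” branch

$$
\begin{aligned}
\mathrm{Be}^7 + \mathrm{H}^1 &\to \mathrm{B}^8 + \gamma \\
\mathrm{B}^8 &\to \mathrm{Be}^8 + e^+ + \nu \qquad (E_\nu = 7.2\ \mathrm{MeV}) \\
\mathrm{Be}^8 &\to 2\,\mathrm{He}^4
\end{aligned}
$$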
(Mean neutrino energies are given in parentheses.) Pontecorvo$^{115}$ and Alvarez$^{116}$
suggested that the neutrinos could be detected in Cl$^{37}$ through the endothermic
reaction

$$\nu + \mathrm{Cl}^{37} \to e^- + \mathrm{Ar}^{37} \qquad (15.7.37)$$

The Ar$^{37}$ decays by electron capture with a convenient half-life of 35 days, so it
can be detected by its radioactivity after chemical separation. As pointed out by
Bahcall,$^{117}$ the energetic neutrinos from B$^8$ beta decay are particularly effective
in the reaction (15.7.37), because they can induce superallowed transitions to an
excited state of Ar$^{37}$. Hence, even though the PP III branch is much less important
than the PP II branch, about 90% of the neutrino absorption events in Cl$^{37}$
would be expected to arise from B$^8$ neutrinos, and about 10% from Be$^7$ neutrinos.
Using the extant solar models$^{114}$ with $Y = 0.27$, Bahcall$^{117}$ calculated a neutrino
capture rate on the earth of $(4 \pm 2) \times 10^{-35}$ sec$^{-1}$ per Cl$^{37}$ atom, and Davis$^{118}$
set out to measure this rate, using 100,000 gallons of perchlorethylene (C$_2$Cl$_4$), a
common cleaning fluid, in the Homestake gold mine at Lead, South Dakota. In
1968 Davis et al.$^{119}$ announced that they had failed to detect any solar neutrinos
and could set an upper limit of $0.3 \times 10^{-35}$ sec$^{-1}$ on the absorption rate per Cl$^{37}$
atom, about an order of magnitude less than what had originally been expected!
This discrepancy between theory and observation, in the first experiment that ever
looked directly into the solar interior, has shaken the general faith in accepted
solar models, and in the values they yield for the initial helium abundance of the
sun. Needless to say, a great deal of work has been put into recalculation of the
expected neutrino fluxes, using improved values for the opacity and for various
nuclear reaction rates. In a companion paper to the letter of Davis et al.,$^{119}$
Bahcall et al.$^{120}$ estimated an absorption rate for $Z = 0.015$ of $(0.75 \pm 0.3) \times
10^{-35}$ sec$^{-1}$ per Cl$^{37}$ atom, still too large by a factor of two, and calculations using
the Berkeley stellar structure code gave slightly larger rates.$^{121}$ Iben$^{122}$ has
noted that both $Y$ and the neutrino absorption rates are increasing functions of $Z$,
and that the minimum possible absorption rate, attained by $Z = 0$ and $Y \simeq 0.17$,
is just about the same as Davis et al.’s upper limit. The latest calculation of
Bahcall and Uhlrich$^{122a}$ gives a counting rate of $(0.9 \pm 0.5) \times 10^{-35}$ sec$^{-1}$ per
Cl$^{37}$ atom.

Meanwhile, Davis’ group continued their observations, and have recently
announced a counting rate of $(0.15 \pm 0.1) \times 10^{-35}$ sec$^{-1}$ per Cl$^{37}$ atom,$^{122b}$
about six times less than expected. In view of this discrepancy, the question of the
initial solar helium abundance must for the present be regarded as unsettled.

The masses and luminosities are also known for a number of nearby Population
I stars that happen to belong to binary systems. Comparison of these $M$ and $L$
values with the theoretical $Y$-dependent $M$-$L$ relation yields $Y$ values$^{123}$ for
these stars lying generally in the range 0.25 to 0.35. It would be very interesting
to carry out this analysis for stars of Population II, because they represent an
earlier stellar generation, and also because the Davis neutrino experiment has
shaken our faith in the theory of stars of Population I. Unfortunately, there are
very few stars of Population II near the sun. One of them, $\mu$ Cassiopeiae A, belongs
to a binary system whose separation has recently been measured by a very
ingenious method.$^{124}$ The resulting mass value, together with the observed values
for $L$ and $Z/X$, does not agree with theory$^{125}$ for any value of $Y$, but fits best a
low helium abundance, with $Y < 0.05$. However, the validity of this mass
determination has since been put into doubt.$^{125a}$

(B) Direct Solar Measurements. There are a number of different methods for
estimating the present helium abundance of the sun, which do not rest on any
detailed theory of solar structure and evolution. Measurements of the ratio $Y/Z$
in solar cosmic rays,$^{126}$ together with the spectroscopic determination of $Z/X$ in
the solar photosphere mentioned above, suggest a helium abundance$^{127}$ $Y \simeq
0.20$ to 0.26. During periods of quiet sun, the He$^4$/H ratio in the solar wind suggests
a value$^{128}$ $Y$ about equal to 0.15, but the helium content of the solar wind roughly
doubles during magnetic storms.$^{129}$ Unfortunately, the solar surface is too cool
to allow a spectroscopic determination of $Y$, but a value of $Y \simeq 0.38$ is suggested
by observation of solar prominences.$^{130}$

(C) Globular Clusters: Theory. As already mentioned in Section 15.3, the
comparison of the numbers of stars in different regions of the Hertzsprung-Russell
diagrams of globular clusters with theory yields results for both the age and the
initial helium abundance of these clusters. Iben$^{131}$ derives values of $Y$ in the
range of 0.24 to 0.33, corresponding to ages $18 \times 10^{9}$ years to $9 \times 10^{9}$ years.
Comparison of the stellar pulsation theory of Christy$^{132}$ with the location of
variable stars in the Hertzsprung-Russell diagrams of the globular clusters M3,
M15, M92 yields $Y \simeq 0.26$ to 0.32 for these clusters.$^{133}$ These studies should be
given particular weight, because the globular clusters are believed to be among
the first objects to condense out of the primordial gas of hydrogen and helium.

(D) Stellar Spectra. Helium lines are visible in the photospheres of a large
number of hot stars of both populations. In general, helium abundances appear to
be high,$^{123}$ with $Y/X$ of order 0.4, and some stars are apparently superabundant
in helium. There are several classes of old stars that show anomalously weak
helium lines, such as the blue Population II stars on the horizontal branch of the
Hertzsprung-Russell diagram of globular clusters.$^{134}$ One peculiar star,
3 Centauri A, has a low abundance of helium, most of which is in the form of the isotope
He$^3$! Planetary nebulae$^{135}$ and novae$^{123}$ generally appear to be superabundant
in helium.

(E) Spectroscopy of Interstellar Matter. Optical frequency emission lines
from H II regions in our galaxy yield helium-hydrogen number ratios$^{123}$ that are
consistently in the range 0.10-0.14, corresponding to a helium abundance by weight
$Y \simeq 0.27$-0.36. It is also possible to observe the recombination of ionized helium
at radio frequencies;$^{136}$ the radiation emitted in a transition $n + 1 \to n$ has a
wavelength proportional to $n^3$ for $n \gg 1$, so that transitions with $n \simeq 100$ have
wavelengths of the order of centimeters. The helium-hydrogen number ratios$^{123}$
deduced from radio observations of interstellar matter range from 0.06 to 0.16,
corresponding to $Y \simeq 0.14$ to $Y \simeq 0.40$.

(F) Extragalactic Measurements. The emission lines of helium observed$^{123}$
in H II regions in galaxies within and without our local group indicate a helium
abundance similar to that of the H II regions in our own galaxy. On the other
hand, quasi-stellar sources show remarkably weak helium lines.$^{123}$

There is clearly a good deal of evidence for a universal helium abundance by
weight not too different from the predicted value of 27%. Unfortunately, there is
also a large body of evidence for a much smaller helium abundance. The
clarification of this problem would be of the highest importance for cosmology, because the
cosmologically produced helium, together with the 2.7 °K radiation background,
may be the only relics of the primordial fireball that can serve as clues to the early
history of the universe.

In order to keep an open mind about the synthesis of the elements in the early
universe, it is useful to consider the possible modifications in physical or
astrophysical theory that could affect the production of helium:

(A) Cool Models. If the observed microwave background proves not to be
black-body radiation left over from the early universe, then we would have to
face the possibility that the true present black-body temperature $T_{\gamma 0}$ is very much
less than 2.7 °K. In this case, the baryon number density at any given past
temperature could be much greater than assumed above, with a consequent increase in the
rate of nuclear reactions and in the abundance of complex nuclei produced in the
early universe. Indeed, it was the high helium abundance produced in such cool
models that led Gamow and his collaborators$^{51}$ to suggest the presence of a hot
radiation background.

(B) Fast or Slow Models. Various mechanisms might increase or decrease
the expansion rate. In particular, if the universe contained a thermal distribution
of additional massless quanta, such as gravitons, Brans-Dicke scalar particles, or
new kinds of neutrinos, then the energy density at a given temperature would be
greater, and so, according to Eq. (15.6.44), the time required to reach that
temperature would be shorter. The rate (15.7.33) of deuteron production per free
neutron is normally larger than the expansion rate at $T \simeq 10^{9}$ °K by a factor 10
to $10^{3}$ (for a present density of $10^{-31}$ g/cm$^3$ to $10^{-29}$ g/cm$^3$), so for a moderate
shortening of the time scale there would still be plenty of time for nucleosynthesis
to occur at $T \simeq 10^{9}$ °K. In this case, the only effect of the faster expansion would
be to cut down the time available for the conversion of neutrons into protons, so
that the neutron fraction at $10^{9}$ °K would be closer to its initial value of 1/2 and
more helium would be produced. However, if the time scale were extremely short,
there would not be time for complex nuclei to be formed before the density (and,
for He$^3$ and He$^4$ formation, the temperature) falls too low. The detailed
calculations of Peebles$^{137}$ show that for $T_{\gamma 0} = 3$ °K and a present density of $7 \times 10^{-31}$
g/cm$^3$ to $1.8 \times 10^{-29}$ g/cm$^3$, the He$^4$ abundance rises as the time scale is
shortened until it reaches a maximum of 60% to 80% (by weight) for a time scale
shortened by a factor $10^{-1}$ to $10^{-2}$, and then falls off again. The deuteron
abundance continues to rise with shortening time scale, reaching a maximum of about
9% (by weight) when the time scale is shortened by a factor $3 \times 10^{-3}$ to $3 \times
10^{-4}$, and then falls off again. On the other hand, if the expansion time scale were
somehow lengthened, the only effect would be that more neutrons would decay into
protons before nucleosynthesis occurs, so that less helium would be produced.

(C) Neutrino-Electron Interactions. The thermal history of the early universe
was worked out in the last section under the assumption that both electron- and
muon-type neutrinos lose thermal contact with the $e^+$-$e^-$-$\gamma$ plasma before
the onset of $e^+$-$e^-$ annihilation. This assumption is probably valid if the
neutrino-electron scattering is produced by the usual Fermi weak interaction
with the same strength as in nuclear beta decay or muon decay. However, the
neutrino-electron interaction has not yet been measured experimentally, and it
could be somewhat stronger than expected.$^{138}$ In this case, the $\nu_e$ and $\bar{\nu}_e$ (and
possibly also the $\nu_\mu$ and $\bar{\nu}_\mu$) might remain in thermal equilibrium with the $e^+$-$e^-$-$\gamma$
plasma until nearly all $e^+$-$e^-$ pairs have annihilated. The effect would be to
increase the energy density at any given temperature, and also to eliminate the
distinction between $T_\nu$ and $T$ in the rates $\lambda(n \to p)$ and $\lambda(p \to n)$. Detailed
calculations$^{139}$ show that if electron-type neutrinos remain in thermal equilibrium until
helium synthesis, then the abundance of cosmologically produced helium would be
about 29% instead of 27%.

(D) Neutrino or Antineutrino Degeneracy. It is also interesting to consider
the effect of a $\nu_e$ or $\bar{\nu}_e$ degeneracy on the helium abundance. One effect is that the
increased density would speed up the expansion. In addition, the imbalance
between neutrinos and antineutrinos would affect the relative abundance of
protons and neutrons. The difference between the neutron and proton chemical
potentials in thermal equilibrium is given by Eq. (15.6.4):

$$\mu_n - \mu_p = \mu_{e^-} - \mu_{\nu_e}$$

We saw in the last section that, during the period of interest, $\mu_{e^-}$ is required to be
negligible to maintain charge neutrality, whereas $\mu_{\nu_e}/kT$ is a constant $\nu$ (with
$|\nu| < 45$):

$$\mu_{e^-} \simeq 0 \qquad \mu_{\nu_e} = \nu kT$$

The equilibrium neutron fraction is then given by Eq. (15.7.19) as

$$X_n \equiv \frac{n_n}{n_p + n_n} = \frac{1}{1 + \exp\left(\nu + \dfrac{Q}{kT}\right)}$$

where $Q = m_n - m_p$. Thus, if $\nu$ were large and positive, the neutron fraction would
start small and remain small, so that very little nucleosynthesis would occur. If $\nu$
were moderately negative, say $\nu \approx -1$, then the initial neutron fraction would
be high, so that after some neutrons were converted to protons, the neutron
fraction at the onset of nucleosynthesis could be close to the optimum value of
50%, and essentially all the matter of the universe could be converted to helium.
If $\nu$ were large and negative, then the initial neutron abundance would be
extremely high, and no nucleosynthesis could occur until some neutrons could decay
into protons, at which time the nucleon density would have been too low to allow
much synthesis of complex nuclei. Detailed calculations of the abundance of H$^2$,
He$^3$, He$^4$, and Li$^7$ as functions of $\nu$ have been carried out by Wagoner, Fowler,
and Hoyle,$^{140}$ taking into account the effects of the neutrino or antineutrino
degeneracy on the rates (15.7.2)-(15.7.7). These calculations show that if the
“missing mass” consists of degenerate neutrinos or antineutrinos with $|\nu| \simeq 30$,
then the cosmologically produced abundance (by weight) of helium would be
considerably less than 1%. On the other hand, if the lepton number density $N_L$
of the universe is of the same order as the baryon number density $N_B$, then (15.6.52)
shows that $|\nu|$ is of order $1/\sigma$, or about $10^{-9}$. [See Eq. (15.5.15).] In this case the
slight excess of neutrinos or antineutrinos has no appreciable effect on the
synthesis of helium.
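
The qualitative dependence of the equilibrium neutron fraction on the degeneracy parameter can be checked numerically. The sketch below evaluates $X_n = 1/[1 + \exp(\nu + Q/kT)]$; the numerical values $Q \approx 1.293$ MeV and $k \approx 8.617 \times 10^{-11}$ MeV/°K are standard constants supplied here as assumptions (they are not quoted in this section), and the function name is illustrative. It describes thermal equilibrium only, not the later freezing-out or decay of neutrons.

```python
import math

Q_MEV = 1.293              # neutron-proton mass difference (assumed value), MeV
K_MEV_PER_K = 8.617e-11    # Boltzmann constant (assumed value), MeV per degree K


def equilibrium_neutron_fraction(temperature_k: float, nu: float = 0.0) -> float:
    """X_n = 1 / (1 + exp(nu + Q/kT)) for neutrino degeneracy parameter nu."""
    exponent = nu + Q_MEV / (K_MEV_PER_K * temperature_k)
    if exponent > 700.0:
        # exp() would overflow; the neutron fraction is effectively zero.
        return 0.0
    return 1.0 / (1.0 + math.exp(exponent))


if __name__ == "__main__":
    for nu in (5.0, 0.0, -1.0, -5.0):
        x_hot = equilibrium_neutron_fraction(1.0e11, nu)
        x_cool = equilibrium_neutron_fraction(1.0e10, nu)
        print(f"nu = {nu:+.1f}:  X_n(1e11 K) = {x_hot:.3f},  X_n(1e10 K) = {x_cool:.3f}")
```

For $\nu = 0$ the fraction tends to 1/2 at high temperature; positive $\nu$ suppresses it and negative $\nu$ enhances it, as described above.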

One final warning: Even if a large cosmic abundance of helium is definitely 
established, it would not necessarily follow that this helium was formed in the 
early universe. Geoffrey Burbidge 141 has particularly emphasized the possibility 
that helium could have been synthesized in an earlier, more luminous phase of our 
galaxy’s history, perhaps in massive galactic objects. A good part of the calcula- 
tions discussed in this chapter would also apply to nucleosynthesis in the collapse 
of massive stars. 142 




8 The Formation of Galaxies 

In the last two sections we have considered two constituents of the present
universe, helium and the microwave background, which may be relics of an
earlier era of cosmic history. Looking at the night sky, we see one other possible
relic: the clumping of stars into clusters, galaxies, and clusters of galaxies. It is
natural to interpret this clumping as the effect of gravitational attraction acting
on initially uniform diffuse matter, as first suggested by Newton in a famous
letter to Dr. Richard Bentley.$^{143}$ Unfortunately, we still do not have even a
tentative quantitative theory of the formation of galaxies, anywhere near so
complete and plausible as our theories of the origin of the cosmic abundance of
helium or the microwave background.

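The first serious theory of galaxy formation was proposed by Sir James Jeans
early in this century.$^{144}$ Jeans supposed the universe to be filled with a
nonrelativistic fluid, with mass density $\rho$, pressure $p$, velocity $\mathbf{v}$, and gravitational
field $\mathbf{g}$, governed by the equation of continuity

$$\frac{\partial \rho}{\partial t} + \nabla \cdot (\rho \mathbf{v}) = 0 \qquad (15.8.1)$$

the Euler equation

$$\frac{\partial \mathbf{v}}{\partial t} + (\mathbf{v} \cdot \nabla)\mathbf{v} = -\frac{1}{\rho}\nabla p + \mathbf{g} \qquad (15.8.2)$$

and the gravitational field equations

$$\nabla \times \mathbf{g} = 0 \qquad (15.8.3)$$

$$\nabla \cdot \mathbf{g} = -4\pi G\rho \qquad (15.8.4)$$

The effects of gravitation were ignored in the unperturbed “solution,” taken to
be that for a static uniform fluid:

$$\rho = \text{constant} \qquad p = \text{constant} \qquad \mathbf{v} = 0$$

If we add small perturbations $\rho_1$, $p_1$, $\mathbf{v}_1$, $\mathbf{g}_1$, then to first order Eqs. (15.8.1)-(15.8.4)
become

$$
\begin{aligned}
\frac{\partial \rho_1}{\partial t} + \rho \nabla \cdot \mathbf{v}_1 &= 0 \\
\frac{\partial \mathbf{v}_1}{\partial t} &= -\frac{v_s^2}{\rho}\nabla \rho_1 + \mathbf{g}_1 \\
\nabla \times \mathbf{g}_1 &= 0 \\
\nabla \cdot \mathbf{g}_1 &= -4\pi G\rho_1
\end{aligned}
$$

where $v_s$ is the speed of sound,

$$v_s^2 \equiv \frac{p_1}{\rho_1} = \left(\frac{\partial p}{\partial \rho}\right)_{\text{adiabatic}}$$

and all quantities that do not carry a subscript “1” are now understood to refer
to the unperturbed “solution.” Combining these equations gives a differential
equation for $\rho_1$:

$$\frac{\partial^2 \rho_1}{\partial t^2} = v_s^2 \nabla^2 \rho_1 + 4\pi G\rho\rho_1$$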

The solution takes the form

$$\rho_1 \propto \exp\{i\mathbf{k} \cdot \mathbf{x} - i\omega t\} \qquad (15.8.5)$$

with $\omega$ and $\mathbf{k}$ related by the “dispersion relation”

$$\omega^2 = \mathbf{k}^2 v_s^2 - 4\pi G\rho \qquad (15.8.6)$$

This result bears a very close resemblance to the dispersion relation for longitudinal
electrostatic oscillations in a plasma,$^{49}$

$$\omega^2 = \mathbf{k}^2 v_s^2 + \frac{4\pi n_e e^2}{m_e} \qquad (15.8.7)$$

where $e$, $m_e$, and $n_e$ are the (unrationalized) charge, mass, and number density of
electrons. The difference between (15.8.6) and (15.8.7) is that in (15.8.6), $n_e$ is
replaced with the number density $\rho/m$, $m_e$ is replaced with $m$, $e^2$ is replaced with
the Newtonian “coupling constant” $Gm^2$, and an extra minus sign is inserted to
take account of the attractive nature of gravitation. Because of the minus sign
in (15.8.6), the “gravitostatic” waves exhibit an instability that is not present in
plasma waves: $\omega$ is imaginary for wave numbers below the critical value

$$k_J = \left(\frac{4\pi G\rho}{v_s^2}\right)^{1/2} \qquad (15.8.8)$$

so that $\rho_1$ can grow (or decay) exponentially, with an e-folding rate

$$\operatorname{Im}\omega = v_s\,(k_J^2 - \mathbf{k}^2)^{1/2} \quad \text{for } \mathbf{k}^2 < k_J^2 \qquad (15.8.9)$$
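
The Jeans criterion can be made concrete with a few lines of code. The sketch below evaluates the dispersion relation (15.8.6), the critical wave number (15.8.8), and the e-folding rate (15.8.9) in cgs units; the value of $G$ is a standard constant supplied here, and the density and sound speed in the example are arbitrary illustrative numbers, not values taken from the text.

```python
import math

G = 6.674e-8  # Newton's constant in cgs units, cm^3 g^-1 s^-2 (assumed value)


def omega_squared(k: float, rho: float, v_s: float) -> float:
    """Dispersion relation (15.8.6): omega^2 = k^2 v_s^2 - 4 pi G rho."""
    return k * k * v_s * v_s - 4.0 * math.pi * G * rho


def jeans_wavenumber(rho: float, v_s: float) -> float:
    """Critical wave number (15.8.8): k_J = (4 pi G rho / v_s^2)^(1/2)."""
    return math.sqrt(4.0 * math.pi * G * rho) / v_s


def growth_rate(k: float, rho: float, v_s: float) -> float:
    """E-folding rate (15.8.9); zero for k >= k_J, where the wave only oscillates."""
    k_j = jeans_wavenumber(rho, v_s)
    if k >= k_j:
        return 0.0
    return v_s * math.sqrt(k_j * k_j - k * k)


if __name__ == "__main__":
    rho, v_s = 1.0e-24, 1.0e6  # illustrative density (g/cm^3) and sound speed (cm/s)
    k_j = jeans_wavenumber(rho, v_s)
    for factor in (0.1, 0.5, 0.9, 2.0):
        k = factor * k_j
        print(f"k/k_J = {factor:3.1f}: omega^2 = {omega_squared(k, rho, v_s):+.3e} s^-2, "
              f"growth rate = {growth_rate(k, rho, v_s):.3e} s^-1")
```

The growth rate is largest as $\mathbf{k} \to 0$, where it approaches $v_s k_J = (4\pi G\rho)^{1/2}$ independently of the sound speed.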


Unfortunately, Jeans’s theory is not applicable to the formation of galaxies
in an expanding universe, because Jeans assumed a static medium, whereas the
rate of expansion of the universe is given by Eq. (15.1.20) in all cases of interest as

$$\frac{\dot{R}}{R} = \left(\frac{8\pi\rho G}{3}\right)^{1/2} = \left(\frac{2}{3}\right)^{1/2} v_s k_J \qquad (15.8.10)$$

which is of the same order as the maximum value of the growth rate (15.8.9). The
first satisfactory theory of the instabilities of an expanding universe was given in
1946 by Lifshitz.$^{145}$ He showed that disturbances at wave numbers below $k_J$
grow, not exponentially, but like a power of $t$ or of $R(t)$. This result will be derived
and discussed in detail below, using both the nonrelativistic treatment suggested
in 1957 by Bonnor$^{146}$ (Section 15.9) and a simplified version of Lifshitz’s
relativistic theory (Section 15.10).

Although we are not yet in a position to determine the rates with which
disturbances actually grow, we can rather easily decide which disturbances can
grow and which cannot. For sufficiently large wave numbers, the waves described
by Jeans’s theory become ordinary sound waves, with

$$\omega^2 = \mathbf{k}^2 v_s^2 \qquad (15.8.11)$$

What are the conditions for this simple dispersion relation to be valid?
Gravitational forces will be negligible if the gravitational energy of a sphere of radius
$|\mathbf{k}|^{-1}$ is much smaller than its thermal energy:

$$\frac{G(\rho|\mathbf{k}|^{-3})^2}{|\mathbf{k}|^{-1}} \ll \rho v_s^2 |\mathbf{k}|^{-3}$$

Also, the expansion of the universe will have negligible effect if the expansion rate
is much less than the frequency:

$$\sqrt{G\rho} \ll |\mathbf{k}| v_s$$

Both of these conditions will be satisfied by the relation (15.8.11), as long as the
wave number satisfies the condition

$$|\mathbf{k}| \gg k_J$$

just as in Jeans’s theory. Thus, even when the expansion of the universe is taken
into account, we expect there to be a critical wave number, of order $k_J$, above
which disturbances cannot grow, but only oscillate like sound waves.

Since the expansion of the universe causes $\mathbf{k}$ to decrease as $1/R$, it is convenient
to characterize disturbances by a constant, the rest-mass within a sphere of radius
$2\pi/|\mathbf{k}|$:

$$M = \frac{4\pi n m_H}{3}\left(\frac{2\pi}{|\mathbf{k}|}\right)^3 \qquad (15.8.12)$$

where $n$ is the hydrogen number density. According to the above analysis, the
only growing disturbances are those with wave number less than $k_J$, and hence
with mass $M$ greater than the Jeans mass

$$M_J \equiv \frac{4\pi n m_H}{3}\left(\frac{2\pi}{k_J}\right)^3 = \frac{4\pi n m_H}{3}\left(\frac{\pi v_s^2}{G[\rho + p]}\right)^{3/2} \qquad (15.8.13)$$

(It proves convenient here to replace $\rho$ with $\rho + p$; this is permissible, because
$M_J$ is used only in order-of-magnitude arguments, and $p$ is never more than $\rho/3$.)
We can gain a good deal of insight into the history of a protogalactic fluctuation
by following the variations in $M_J$ caused by the expansion of the universe. (See
Figure 15.6.)

From the time of $e^+$-$e^-$ annihilation ($T \simeq 10^{10}$ °K) until the time of
recombination of hydrogen ($T \simeq 4000$ °K), it is a good approximation to take the
contents of the universe as nonrelativistic ionized hydrogen plus black-body
electromagnetic radiation, both in thermal equilibrium at temperature $T$. Also,
since the photon entropy $\sigma k$ is so large, we may neglect the pressure, thermal
energy, and entropy of the matter. The total energy density, pressure, and specific
entropy are then (aside from the uncoupled neutrinos)

$$\rho = n m_H + aT^4 \qquad (15.8.14)$$

$$p = \tfrac{1}{3} aT^4 \qquad (15.8.15)$$

$$\sigma = \frac{4aT^3}{3nk} \qquad (15.8.16)$$

[Figure 15.6 plot: horizontal axis $T_\gamma$ (°K); regions labeled Unstable, Acoustic, Unstable.]

Figure 15.6 Jeans mass as a function of
radiation temperature. Solid line is for
$\sigma = 0.8 \times 10^{8}$, corresponding to $T_{\gamma 0} =
2.7$ °K, $\rho_0 = 3 \times 10^{-29}$ g/cm$^3$. Dashed line is
for $\sigma = 2.4 \times 10^{9}$, corresponding to $T_{\gamma 0} =
2.7$ °K, $\rho_0 = 10^{-30}$ g/cm$^3$. The drop in
Jeans mass at recombination is somewhat
more gradual than shown here.
In an adiabatic disturbance $\sigma$ is constant, so $n$ varies as $T^3$, and hence

$$\delta\rho = [3nm_H + 4aT^4]\,\frac{\delta T}{T}$$

$$\delta p = \left[\tfrac{4}{3} aT^4\right]\frac{\delta T}{T}$$

The speed of sound is therefore

$$v_s^2 = \left(\frac{\delta p}{\delta\rho}\right)_{\text{adiabatic}} = \frac{1}{3}\left(\frac{kT\sigma}{m_H + kT\sigma}\right) \qquad (15.8.17)$$

and the Jeans mass (15.8.13) has the value

$$M_J = \frac{2\pi^{5/2} k^2 \sigma^2}{9 a^{1/2} m_H^2 G^{3/2} (1 + \sigma kT/m_H)^3} \qquad (15.8.18)$$

or, in terms of the solar mass,

$$M_J = 9.06\, M_\odot\, \sigma^2 \left(1 + \frac{\sigma kT}{m_H}\right)^{-3} \qquad (15.8.19)$$

Once the hydrogen recombines at $T_R \simeq 4000$ °K, the radiation pressure
becomes ineffective, and the equations of state are those for a monatomic ideal
gas with $\gamma = 5/3$:

$$\rho = n m_H + \tfrac{3}{2} nkT \qquad (15.8.20)$$

$$p = nkT \qquad (15.8.21)$$

The speed of sound here has the familiar value

$$v_s^2 = \frac{5}{3}\frac{kT}{m_H} \qquad (15.8.22)$$

and the Jeans mass (15.8.13) has the value

$$M_J = \frac{4\pi^{5/2}}{3\, n^{1/2} m_H^2}\left(\frac{5kT}{3G}\right)^{3/2} \qquad (15.8.23)$$

Just after recombination, the matter temperature $T$ is the same as the radiation
temperature, so $T$ can be expressed in terms of $n$ and the specific photon entropy
(15.8.16); this gives

$$M_J = \frac{2\pi^{5/2}\, 5^{3/2}\, k^2 \sigma^{1/2}}{9 a^{1/2} m_H^2 G^{3/2}} = 102\, M_\odot\, \sigma^{1/2} \qquad (15.8.24)$$

As long as no additional heat is put into the gas, its temperature will drop as
$R^{-2}$ [see Eq. (15.5.16)], and since $n$ drops as $R^{-3}$, the Jeans mass (15.8.23) will
decrease as $R^{-3/2}$.
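
The behavior of the Jeans mass on either side of recombination can be explored numerically. The sketch below is an illustration under stated assumptions, not the calculation behind Figure 15.6: it takes the closed forms that follow from (15.8.13)-(15.8.17) and (15.8.22), namely $M_J = 2\pi^{5/2}k^2\sigma^2/[9a^{1/2}m_H^2G^{3/2}(1 + \sigma kT/m_H)^3]$ before recombination and $5^{3/2}$ times the same prefactor times $\sigma^{1/2}$ just after it, and evaluates them with present-day cgs constants supplied here (with the factor $c^2$ restored in $\sigma kT/m_H c^2$). With these constants the prefactor comes out near 9 solar masses. All function and constant names are hypothetical.

```python
import math

# Physical constants in cgs units (assumed modern values, not from the text).
K_B = 1.380649e-16     # Boltzmann constant, erg/K
A_RAD = 7.5657e-15     # radiation constant a, erg cm^-3 K^-4
M_H = 1.6735e-24       # hydrogen mass, g
G = 6.674e-8           # Newton's constant, cm^3 g^-1 s^-2
C = 2.998e10           # speed of light, cm/s
M_SUN = 1.989e33       # solar mass, g

# Common prefactor 2 pi^(5/2) k^2 / (9 a^(1/2) m_H^2 G^(3/2)), in solar masses.
PREFACTOR_MSUN = (2.0 * math.pi ** 2.5 * K_B ** 2
                  / (9.0 * math.sqrt(A_RAD) * M_H ** 2 * G ** 1.5)) / M_SUN


def specific_photon_entropy(t_gamma0: float, rho0: float) -> float:
    """sigma = 4 a T^3 / (3 n k), evaluated at the present epoch, Eq. (15.8.16)."""
    n0 = rho0 / M_H
    return 4.0 * A_RAD * t_gamma0 ** 3 / (3.0 * n0 * K_B)


def jeans_mass_before_recombination(temperature: float, sigma: float) -> float:
    """Jeans mass in solar masses while matter and radiation are coupled."""
    x = sigma * K_B * temperature / (M_H * C ** 2)   # sigma k T / m_H
    return PREFACTOR_MSUN * sigma ** 2 / (1.0 + x) ** 3


def jeans_mass_after_recombination(sigma: float) -> float:
    """Jeans mass in solar masses just after recombination."""
    return 5.0 ** 1.5 * PREFACTOR_MSUN * math.sqrt(sigma)


if __name__ == "__main__":
    for rho0 in (3.0e-29, 1.0e-30):
        sigma = specific_photon_entropy(2.7, rho0)
        print(f"rho0 = {rho0:.1e} g/cm^3, sigma = {sigma:.2e}")
        for temp in (1.0e9, 1.0e6, 4.0e3):
            m_j = jeans_mass_before_recombination(temp, sigma)
            print(f"  T = {temp:.0e} K: M_J = {m_j:.2e} M_sun")
        print(f"  after recombination: M_J = {jeans_mass_after_recombination(sigma):.2e} M_sun")
```

The output shows the three regimes discussed in the text: a small Jeans mass at high temperature, a plateau proportional to $\sigma^2$ before recombination, and a drop by many orders of magnitude once the radiation pressure no longer supports the gas.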
We can now see how profound an effect the black-body radiation has on the
growth of fluctuations.$^{147}$ As emphasized in Section 15.5, the present microwave
temperature 2.7 °K indicates that $\sigma$ is very large, of order $10^{8}$ to $10^{9}$. In
consequence, the Jeans mass (15.8.19) starts at $T \simeq 10^{9}$ °K at a very low value
$10^{13} M_\odot/\sigma$, of order $10^{5} M_\odot$ to $10^{4} M_\odot$; then rises like $T^{-3}$ until $T$ reaches a
temperature $m_H/k\sigma$, of order $10^{5}$ °K to $10^{4}$ °K; then levels off at a very high value
$9\sigma^2 M_\odot$, of order $10^{17} M_\odot$ to $10^{19} M_\odot$ until the recombination of hydrogen;
following this, $M_J$ drops precipitously to the value (15.8.24), of order $10^{6} M_\odot$ to
$3 \times 10^{6} M_\odot$; and it continues to drop as $R^{-3/2}$ thereafter. If we fix our attention
on a particular fluctuation whose mass $M$ has the value $M_G \simeq 10^{11} M_\odot$ of a
typical good-sized present galaxy, we may distinguish three distinct phases in its
growth:

(A) The Jeans mass (15.8.19) will be less than the galactic mass until the
temperature drops to the value

$$T = \left(\frac{9 M_\odot}{\sigma M_G}\right)^{1/3} \frac{m_H}{k} \simeq 10^{7}\ {}^\circ\mathrm{K} \qquad (15.8.25)$$

During this period, the amplitude of the fluctuation will have a chance to grow
under the influence of its own gravitation. Since the total energy density is
dominated in this early phase by radiation, this is a relativistic problem, and the
growth rate must be calculated in a general-relativistic formalism. It is shown in
Section 15.10 that the fastest-growing normal modes have a density contrast
$\delta\rho/\rho$ that grows as $t$.

(B) From the time when $T$ drops below the value (15.8.25) until the
recombination of hydrogen at $T \simeq 4000$ °K, the Jeans mass will be larger than the
galactic mass, so the protogalactic disturbance will behave like a packet of ordinary
sound waves. No appreciable growth is possible during this phase. For a relatively
high present density, say of order $3 \times 10^{-29}$ g/cm$^3$, there is a long period before
recombination when $\sigma kT < m_H$, so that the total energy density is dominated by
the hydrogen rest mass, and the protogalactic sound waves can be treated by
Newtonian mechanics (see Section 15.9). For a relatively low present density, say
of order $10^{-30}$ g/cm$^3$, we have $\sigma kT > m_H$ throughout virtually all of Phase B,
so that a relativistic treatment is required. (See Section 15.10.)

(C) From the time of recombination until the present, the Jeans mass will be
much less than the galactic mass, so the fluctuation amplitude can again grow.
The total energy density in this phase is dominated by hydrogen rest-mass, so
this is a nonrelativistic problem, and the growth rate can be calculated by
Newtonian methods. The density contrast $\delta\rho/\rho$ is shown in Section 15.9 to grow
roughly as $t^{2/3}$.

There is one disappointing aspect of this general picture: Nothing so far gives any clue to the reason for the observed galactic mass distribution. (The Jeans mass just before recombination is very much larger than any galactic mass, whereas the Jeans mass just after recombination appears to be related to the mass of a globular cluster¹⁴⁷ᵃ rather than a galaxy.) Such a clue has recently appeared in calculations¹⁴⁸ of the damping of protogalactic fluctuations while they are undergoing essentially acoustic oscillations in Phase B. Dissipation will be important whenever some particle's mean free time is too long to maintain perfect thermal equilibrium. For photons, the chief collision mechanism during Phase B is scattering by nonrelativistic electrons, and the mean free time here is

$$\tau_\gamma = \frac{1}{n\sigma_T} \qquad (15.8.26)$$

where $\sigma_T$ is the Thomson cross-section

$$\sigma_T = \frac{8\pi e^4}{3 m_e^2} = 0.6652 \times 10^{-24}\ \text{cm}^2$$


In contrast, the mean free time of an electron or a proton, taking into account only Coulomb collisions, will be of order

$$\tau_e \approx \left[n\left(\frac{kT}{m_e}\right)^{1/2}\frac{e^4}{(kT)^2}\right]^{-1}$$

which is shorter than $\tau_\gamma$ by a factor of order $(kT/m_e)^{3/2}$. Thus the dominant dissipative effects in Phase B will arise from the failure of perfect thermal equilibrium between matter and radiation, rather than from any dissipation in the matter itself. Also, for any reasonable value of the present baryon number density, a fluctuation of mass $10^{11} M_\odot$ will have a radius $2\pi/|\mathbf{k}|$ much larger than the photon mean free path $\tau_\gamma$ throughout virtually all of Phase B, so the interaction between matter and radiation can be treated to first order in $\tau_\gamma$. In this approximation, the medium of protons, electrons, and photons behaves like an imperfect fluid (see Section 2.11), with coefficients of shear viscosity, bulk viscosity, and heat conduction given by¹⁴⁹


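$$\eta = \frac{4}{15}\,aT^4\tau_\gamma \qquad (15.8.27)$$

$$\zeta = 4aT^4\tau_\gamma\left[\frac{1}{3} - \left(\frac{\partial p}{\partial\rho}\right)_n\right]^2 \qquad (15.8.28)$$

$$\chi = \frac{4}{3}\,aT^3\tau_\gamma \qquad (15.8.29)$$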


In general, a sound wave in an imperfect fluid will be damped at a rate¹⁴⁹
(15.8.30)





Using Eqs. (15.8.14), (15.8.15), and (15.8.26)-(15.8.29) in Eq. (15.8.30) gives the damping rate here as

$$\Gamma = \frac{k^2\,aT^4}{6n\sigma_T\left[nm_H + \frac{4}{3}aT^4\right]}\left\{\frac{16}{15} + \frac{n^2 m_H^2}{aT^4\left[nm_H + \frac{4}{3}aT^4\right]}\right\} \qquad (15.8.31)$$


the two terms in the brackets representing the effects of shear viscosity and heat conduction, respectively. (The bulk viscosity (15.8.28) vanishes here, because with the matter pressure and thermal energy neglected, $(\partial p/\partial\rho)_n$ is just $\frac{1}{3}$.) Since $k^2 \propto M^{-2/3}$, the protogalactic sound wave will be damped in amplitude by a factor of form

$$D = \exp\left\{-\int^{t_R}\Gamma\,dt\right\} = \exp\left\{-\left(\frac{M_c}{M}\right)^{2/3}\right\} \qquad (15.8.32)$$


where $M_c$ is some critical mass (and a subscript $R$ denotes the time of hydrogen recombination). For a relatively high present density, there was a long period before recombination when the energy density was dominated by hydrogen rest-mass, so that

$$t \simeq (6\pi n m_H G)^{-1/2}$$

$$\Gamma \simeq \frac{k^2}{6\sigma_T n} \propto t^{2/3}$$

and the critical mass in Eq. (15.8.32) is

$$M_c \simeq \frac{32\pi^4}{3}\, n_R m_H \left(10\,\sigma_T n_R\right)^{-3/2}\left(6\pi n_R m_H G\right)^{-3/4} \qquad (15.8.33)$$


For instance, for a present mass density $3 \times 10^{-29}$ g/cm³, the critical mass is $5 \times 10^{12} M_\odot$. For a relatively low present density, the energy density during virtually all of Phase B was dominated by radiation (including neutrinos), so that

$$t \simeq (15.5\,\pi\, aT^4 G)^{-1/2}$$

$$\Gamma \simeq \frac{2k^2}{15\,\sigma_T n} \propto t^{1/2}$$

and the critical mass in Eq. (15.8.32) is

$$M_c \simeq \frac{32\pi^4}{3}\left(\frac{4m_H}{45\,\sigma_T}\right)^{3/2}\left(15.5\,\pi\, aT_R^4\, G\right)^{-3/4}\left(n_R m_H\right)^{-1/2} \qquad (15.8.34)$$

For instance, for a present mass density $10^{-30}$ g/cm³, the critical mass is $2 \times 10^{14} M_\odot$. A fluctuation will presumably be unable to survive the damping in Phase B if the exponent $(M_c/M)^{2/3}$ in Eq. (15.8.32) is greater than about 10, so we can conclude that the fluctuations surviving at the time of recombination have a minimum mass between $1.6 \times 10^{11} M_\odot$ and $6 \times 10^{12} M_\odot$, just about the mass of a large galaxy.
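
The two critical masses quoted above can be reproduced approximately with the sketch below. It is only an illustration under stated assumptions: CGS units with the factors of $c$ restored, $1 + z_R = 1500$ and $T_R = 4000$ °K as used elsewhere in this chapter, a fluctuation mass $M = (4\pi/3)\,n m_H (2\pi/|\mathbf{k}|)^3$, and the time integral of $\Gamma$ taken from $t = 0$ to $t_R$ with the power laws $\Gamma \propto t^{2/3}$ and $\Gamma \propto t^{1/2}$. All function names are hypothetical.

```python
import math

# CGS constants (standard values, assumed here)
G = 6.674e-8         # cm^3 g^-1 s^-2
C = 2.998e10         # cm/s
M_H = 1.6735e-24     # g
SIGMA_T = 6.652e-25  # Thomson cross-section, cm^2
A_RAD = 7.566e-15    # erg cm^-3 K^-4
M_SUN = 1.989e33     # g


def _critical_mass(rho_r, gamma_integral_over_k2):
    """Solve (M_c/M)^(2/3) = k^2 * (integral of Gamma dt)/k^2 for M_c.

    Uses M = (4 pi/3) rho (2 pi/k)^3, i.e. k^2 = (2 pi)^2 (4 pi rho / 3M)^(2/3).
    """
    x = (2.0 * math.pi) ** 2 * (4.0 * math.pi * rho_r / 3.0) ** (2.0 / 3.0)
    return (x * gamma_integral_over_k2) ** 1.5 / M_SUN


def critical_mass_matter(rho0, one_plus_z=1500.0):
    """High-density case: rest mass dominates before recombination."""
    rho_r = rho0 * one_plus_z ** 3
    n_r = rho_r / M_H
    t_r = (6.0 * math.pi * G * rho_r) ** -0.5
    # Gamma = k^2 c / (6 sigma_T n) grows as t^(2/3): integral = (3/5) Gamma_R t_R
    return _critical_mass(rho_r, 0.6 * C * t_r / (6.0 * SIGMA_T * n_r))


def critical_mass_radiation(rho0, one_plus_z=1500.0, temp_r=4000.0):
    """Low-density case: radiation (with neutrinos) dominates in Phase B."""
    rho_r = rho0 * one_plus_z ** 3
    n_r = rho_r / M_H
    rho_rad = A_RAD * temp_r ** 4 / C ** 2          # a T^4 as a mass density
    t_r = (15.5 * math.pi * G * rho_rad) ** -0.5
    # Gamma = 2 k^2 c / (15 sigma_T n) grows as t^(1/2): integral = (2/3) Gamma_R t_R
    return _critical_mass(rho_r, (2.0 / 3.0) * 2.0 * C * t_r / (15.0 * SIGMA_T * n_r))


if __name__ == "__main__":
    for label, m_c in (("3e-29 g/cm^3", critical_mass_matter(3e-29)),
                       ("1e-30 g/cm^3", critical_mass_radiation(1e-30))):
        print(f"{label}: M_c = {m_c:.2e} M_sun, minimum mass = {m_c / 10 ** 1.5:.2e} M_sun")
```

Dividing each critical mass by $10^{3/2}$ (the exponent threshold of about 10) gives the minimum surviving masses, which fall near the range quoted in the text.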

So far, we have learned that any small fluctuation of mass between about $10^{11} M_\odot$ and $10^{17} M_\odot$ will grow in Phase A, undergo damped oscillations in Phase B, but will survive to trigger new growth in Phase C. One may still wonder why there is a fairly definite upper limit on galactic mass, of order $10^{12} M_\odot$ to $10^{13} M_\odot$, rather than a smooth distribution of masses extending from $10^{11} M_\odot$ up to much higher values. One possible answer lies in the well-known property of nonlinear effects,¹⁵⁰ of transferring energy from long wavelengths to the shortest wavelengths that can survive the effects of dissipation. Unfortunately, the application of the theory of turbulence to problems of galactic growth has just barely begun.¹⁵¹

The theory of the origins of galaxies is of more than academic interest, because the value $(\delta n/n)_R$ of the fractional change in density at the time of recombination may become accessible to observation in the near future.¹⁵² For fluctuations that are approximately adiabatic, the number density is proportional to the cube of the temperature, so the temperature fluctuations at the onset of recombination will be given by

$$\left(\frac{\delta T_\gamma}{T_\gamma}\right)_R = \frac{1}{3}\left(\frac{\delta n}{n}\right)_R \qquad (15.8.35)$$


If the universe has remained optically thin since this time, then these fluctuations will show up in the cosmic microwave background, as fluctuations of the observed cosmic radiation temperature with angle. (Note, however, that Thomson scattering could wash out such inhomogeneities¹⁵²ᵃ without affecting the Planck shape of the distribution function; see Section 15.4.) According to Eqs. (15.5.35)-(15.5.37) and (15.8.12), a fluctuation of mass $M$ will appear to have angular scale

$$\frac{\theta}{2} \simeq q_0 H_0 (1 + z_R)\left(\frac{2\pi}{|\mathbf{k}_R|}\right) \simeq q_0 H_0 (1 + z_R)\left(\frac{3M}{4\pi n_R m_H}\right)^{1/3}$$

or, since $n \propto R^{-3}$,

$$\frac{\theta}{2} \simeq q_0 H_0\left(\frac{3M}{4\pi n_0 m_H}\right)^{1/3} \qquad (15.8.36)$$

For instance, for $q_0 = \frac{1}{2}$, $H_0^{-1} = 13 \times 10^9$ years, and a present density $n_0 m_H \simeq 1.1 \times 10^{-29}$ g/cm³, the fluctuation corresponding to a nascent galaxy of mass $10^{11} M_\odot$ should have an angular scale $\theta \simeq 30''$. As indicated in Section 15.5, the measurement of even small fluctuations with this angular scale is well within the reach of present techniques. It is therefore a matter of some importance to calculate how strong a fluctuation has to be at the time of recombination to grow into a galaxy by the present time. This problem will be addressed in the next section.
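
The angular scale quoted above can be checked with a few lines of arithmetic. This is a minimal sketch that assumes $q_0 = 1/2$ (the value for which $1.1 \times 10^{-29}$ g/cm³ is the critical density at this Hubble time), restores the factor of $c$, and uses standard unit conversions; the function name is illustrative.

```python
import math

C = 2.998e10        # speed of light, cm/s (assumed standard value)
YEAR = 3.156e7      # seconds per year
M_SUN = 1.989e33    # solar mass, g
ARCSEC_PER_RAD = 180.0 * 3600.0 / math.pi


def angular_scale_arcsec(mass_msun, q0=0.5, hubble_time_yr=13e9, rho0=1.1e-29):
    """Angular scale theta of Eq. (15.8.36), in arc seconds.

    theta/2 = q0 * H0 * (3M / (4 pi rho0))^(1/3), with rho0 = n0 * m_H the
    present mass density and lengths measured in units of c/H0.
    """
    radius_now = (3.0 * mass_msun * M_SUN / (4.0 * math.pi * rho0)) ** (1.0 / 3.0)
    hubble_length = C * hubble_time_yr * YEAR
    return 2.0 * q0 * radius_now / hubble_length * ARCSEC_PER_RAD


if __name__ == "__main__":
    print(f"theta = {angular_scale_arcsec(1e11):.1f} arcsec")
```

For a mass of $10^{11} M_\odot$ this gives a little under $30''$; since $\theta \propto M^{1/3}$, a mass eight times larger subtends twice the angle.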
9 Newtonian Theory of Small Fluctuations

We here calculate the behavior of small fluctuations, using the Newtonian equations (15.8.1)-(15.8.4), but now taking into account the expansion of the universe. As remarked at the end of Section 15.1, we can safely employ Newtonian mechanics to deal with astronomical problems in which the energy density is dominated by nonrelativistic particles, so that $p \ll \rho$, and in which the linear scales involved are small compared with the characteristic scale of the universe.¹⁵³

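As shown in Section 15.1, there is a simple spatially uniform solution of Eqs. (15.8.1)-(15.8.4), with

$$\rho = \rho_0\left[\frac{R_0}{R(t)}\right]^3 \qquad (15.9.1)$$

$$\mathbf{v} = \mathbf{r}\left[\frac{\dot R(t)}{R(t)}\right] \qquad (15.9.2)$$

$$\mathbf{g} = -\mathbf{r}\left(\frac{4\pi G\rho}{3}\right) \qquad (15.9.3)$$

where $R(t)$ is a scale factor satisfying the differential equation

$$\dot R^2 + k = \frac{8\pi\rho G R^2}{3} \qquad (15.9.4)$$

or equivalently

$$\ddot R = -\frac{4\pi G\rho}{3}\,R \qquad (15.9.5)$$
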

We now seek a perturbed solution, by adding to the “zero-order” solution (15.9.1)-(15.9.3) the small perturbations $\rho_1$, $\mathbf{v}_1$, and $\mathbf{g}_1$. The hydrodynamic equations (15.8.1)-(15.8.4) then give, to first order in these perturbations,

$$\dot\rho_1 + 3\frac{\dot R}{R}\rho_1 + \frac{\dot R}{R}(\mathbf{r}\cdot\nabla)\rho_1 + \rho\,\nabla\cdot\mathbf{v}_1 = 0 \qquad (15.9.6)$$

$$\dot{\mathbf{v}}_1 + \frac{\dot R}{R}\mathbf{v}_1 + \frac{\dot R}{R}(\mathbf{r}\cdot\nabla)\mathbf{v}_1 = -\frac{1}{\rho}\nabla p_1 + \mathbf{g}_1 \qquad (15.9.7)$$

$$\nabla\times\mathbf{g}_1 = 0 \qquad (15.9.8)$$

$$\nabla\cdot\mathbf{g}_1 = -4\pi G\rho_1 \qquad (15.9.9)$$

Also, since these are for the moment supposed to be adiabatic fluctuations, the pressure perturbation is given by

$$p_1 = v_s^2\,\rho_1 \qquad (15.9.10)$$

where $v_s$ is the speed of sound. (Here and below, $\rho$ denotes the mass density (15.9.1) in the unperturbed solution.)

The equations (15.9.6)-(15.9.10) are spatially homogeneous, so we expect to find plane-wave solutions. Indeed, solutions can be found with the spatial dependence

$$\rho_1(\mathbf{r}, t) = \rho_1(t)\exp\left\{\frac{i\,\mathbf{r}\cdot\mathbf{q}}{R(t)}\right\} \qquad (15.9.11)$$


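and likewise for $\mathbf{v}_1$ and $\mathbf{g}_1$. (The appearance of the factor $1/R$ in the exponential means that the wavelength of these modes is stretched out by the expansion of the universe, as anticipated in the last section.) Equations (15.9.6)-(15.9.10) now become coupled ordinary differential equations:

$$\dot\rho_1 + \frac{3\dot R}{R}\rho_1 + iR^{-1}\rho\,\mathbf{q}\cdot\mathbf{v}_1 = 0 \qquad (15.9.12)$$

$$\dot{\mathbf{v}}_1 + \frac{\dot R}{R}\mathbf{v}_1 = -\frac{i v_s^2}{\rho R}\,\mathbf{q}\,\rho_1 + \mathbf{g}_1 \qquad (15.9.13)$$

$$\mathbf{q}\times\mathbf{g}_1 = 0 \qquad (15.9.14)$$

$$i\,\mathbf{q}\cdot\mathbf{g}_1 = -4\pi G R\,\rho_1 \qquad (15.9.15)$$

The “field equations” (15.9.14) and (15.9.15) have the obvious solution

$$\mathbf{g}_1 = \frac{4\pi i G\rho_1 R\,\mathbf{q}}{\mathbf{q}^2} \qquad (15.9.16)$$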


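To solve the equations of motion, it is convenient to decompose $\mathbf{v}_1$ into parts perpendicular and parallel to $\mathbf{q}$,

$$\mathbf{v}_1(t) = \mathbf{v}_{1\perp}(t) + i\,\mathbf{q}\,\epsilon(t) \qquad (15.9.17)$$

where

$$\mathbf{q}\cdot\mathbf{v}_{1\perp} = 0$$

It is also convenient to express $\rho_1$ in terms of a fractional change in density $\delta$:

$$\rho_1(t) = \rho(t)\,\delta(t) = \rho_0\left[\frac{R_0}{R(t)}\right]^3\delta(t) \qquad (15.9.18)$$

Then (15.9.13) splits into two uncoupled equations

$$\dot{\mathbf{v}}_{1\perp} + \frac{\dot R}{R}\mathbf{v}_{1\perp} = 0 \qquad (15.9.19)$$

$$\dot\epsilon + \frac{\dot R}{R}\epsilon = \left(-\frac{v_s^2}{R} + \frac{4\pi G\rho R}{\mathbf{q}^2}\right)\delta \qquad (15.9.20)$$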





and (15.9.12) simplifies to

$$\dot\delta = \frac{\mathbf{q}^2}{R}\,\epsilon \qquad (15.9.21)$$


Inspection of Eqs. (15.9.19)-(15.9.21) shows that there are two quite different types of normal mode here. The rotational modes, described by $\mathbf{v}_{1\perp}$, simply decay as $1/R$:

$$\mathbf{v}_{1\perp}(t) \propto R^{-1}(t) \qquad (15.9.22)$$

On the other hand, the compressional modes, described by $\epsilon$ and $\delta$, have a more interesting time dependence. Using Eq. (15.9.21) to eliminate $\epsilon$ in Eq. (15.9.20), we find

$$\ddot\delta + \frac{2\dot R}{R}\dot\delta + \left(\frac{v_s^2\mathbf{q}^2}{R^2} - 4\pi G\rho\right)\delta = 0 \qquad (15.9.23)$$

Note that this goes over to the Jeans dispersion relation (15.8.6) if we set $R$ constant and define the wave number $\mathbf{k}$ as $\mathbf{q}/R$. Equation (15.9.23) is the fundamental differential equation that governs the growth or decay of gravitational condensations in an expanding universe.

The above Newtonian theory becomes applicable at the onset of the matter-dominated era, when the energy density of radiation drops below the rest-mass density, so that $p \ll \rho$. Unfortunately, Eq. (15.9.23) is a little too complicated to allow a solution in closed form that would be valid throughout the whole of the matter-dominated era. We can, however, answer the most interesting questions about the behavior of $\delta(t)$ by solving Eq. (15.9.23) in a number of special cases.


(A) Zero Pressure Solutions 

According to the general picture outlined in the last section, a galaxy is supposed to grow out of the small fluctuations present at the time of hydrogen recombination, which are left over from the previous phase of damped acoustic oscillation. The most important problem facing us is how much such a protogalactic fluctuation could have grown from the time of recombination until the present. Or, to put this in terms relevant to observations, how large would a fluctuation have to be at the time of recombination to have a chance of growing into a galaxy by the present time?

To answer these questions we can simplify Eq. (15.9.23) by neglecting the pressure term $v_s^2\mathbf{q}^2/R^2$. This term will be negligible compared with the gravitational term $4\pi G\rho$ if the wave number $|\mathbf{k}| = |\mathbf{q}|/R$ is much less than the Jeans wave number (15.8.8) or, equivalently, if the fluctuation mass (15.8.12) is much greater than the Jeans mass (15.8.13). We saw in the last section that the Jeans mass is of order $10^6 M_\odot$ to $3 \times 10^6 M_\odot$ immediately after recombination, and drops as $R^{-3/2}$ thereafter, so a galactic mass of order $10^{11} M_\odot$ will certainly be much greater than the Jeans mass once the hydrogen recombination is over.
In order to carry the solution of Eq. (15.9.23) forward to the present time, it will be necessary to use the parametric formulas for $R(t)$ and $\rho(t)$ derived in Section 15.3. For positive, zero, and negative curvature, these are:

$k = +1$

$$\frac{R(t)}{R_0} = q_0(2q_0 - 1)^{-1}(1 - \cos\theta)$$

$$H_0 t = q_0(2q_0 - 1)^{-3/2}(\theta - \sin\theta)$$

$$\rho = \frac{3H_0^2(2q_0 - 1)^3}{4\pi G q_0^2(1 - \cos\theta)^3}$$

$k = 0$

$$\frac{R(t)}{R_0} = \left(\frac{3H_0 t}{2}\right)^{2/3}$$

$$\rho = (6\pi G t^2)^{-1}$$

$k = -1$

$$\frac{R(t)}{R_0} = q_0(1 - 2q_0)^{-1}(\cosh\Psi - 1)$$

$$H_0 t = q_0(1 - 2q_0)^{-3/2}(\sinh\Psi - \Psi)$$

$$\rho = \frac{3H_0^2(1 - 2q_0)^3}{4\pi G q_0^2(\cosh\Psi - 1)^3}$$


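With the pressure term neglected, the differential equation (15.9.23) now takes the following forms.

$k = +1$

$$(1 - \cos\theta)\frac{d^2\delta}{d\theta^2} + \sin\theta\,\frac{d\delta}{d\theta} - 3\delta = 0 \qquad (15.9.24)$$

$k = 0$

$$\ddot\delta + \frac{4}{3t}\dot\delta - \frac{2}{3t^2}\delta = 0 \qquad (15.9.25)$$

$k = -1$

$$(\cosh\Psi - 1)\frac{d^2\delta}{d\Psi^2} + \sinh\Psi\,\frac{d\delta}{d\Psi} - 3\delta = 0 \qquad (15.9.26)$$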


In each case there are two independent solutions, which we shall call $\delta_+$ and $\delta_-$:

$k = +1$

$$\delta_+ \propto -\frac{3\theta\sin\theta}{(1 - \cos\theta)^2} + \frac{5 + \cos\theta}{1 - \cos\theta} \qquad (15.9.27)$$

$$\delta_- \propto \frac{\sin\theta}{(1 - \cos\theta)^2} \qquad (15.9.28)$$

$k = 0$

$$\delta_+ \propto t^{2/3} \qquad (15.9.29)$$

$$\delta_- \propto t^{-1} \qquad (15.9.30)$$

$k = -1$

$$\delta_+ \propto -\frac{3\Psi\sinh\Psi}{(\cosh\Psi - 1)^2} + \frac{5 + \cosh\Psi}{\cosh\Psi - 1} \qquad (15.9.31)$$

$$\delta_- \propto \frac{\sinh\Psi}{(\cosh\Psi - 1)^2} \qquad (15.9.32)$$

In all three cases, the solutions $\delta_+$ and $\delta_-$ go over to $t^{2/3}$ and $t^{-1}$, respectively, for $R(t) \ll R_0$, so that if a disturbance starts with comparable amplitudes for the $\delta_+$ and $\delta_-$ modes at recombination, then it will soon be almost purely in the $\delta_+$ mode. We therefore concentrate entirely on the $\delta_+$ mode from now on.
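
The dominance of the growing mode can be illustrated numerically for the zero-curvature equation $\ddot\delta + (4/3t)\dot\delta - (2/3t^2)\delta = 0$. The sketch below is an illustration only: it rewrites the equation in the variable $s = \ln t$, where it becomes $\delta'' + \frac{1}{3}\delta' - \frac{2}{3}\delta = 0$, and integrates with a hand-written fourth-order Runge-Kutta step. The initial condition (unit amplitude, zero logarithmic slope at $t = 1$) is an arbitrary choice that mixes both modes.

```python
import math


def grow_k0(t_end, steps=10000):
    """Integrate the k = 0 perturbation equation from t = 1 to t = t_end.

    State is (delta, v) with v = d(delta)/d(ln t). Returns the final pair.
    Initial data delta = 1, v = 0 correspond to 3/5 of the growing mode
    t^(2/3) plus 2/5 of the decaying mode t^(-1).
    """
    def rhs(d, v):
        return v, -v / 3.0 + 2.0 * d / 3.0

    h = math.log(t_end) / steps
    d, v = 1.0, 0.0
    for _ in range(steps):
        k1d, k1v = rhs(d, v)
        k2d, k2v = rhs(d + 0.5 * h * k1d, v + 0.5 * h * k1v)
        k3d, k3v = rhs(d + 0.5 * h * k2d, v + 0.5 * h * k2v)
        k4d, k4v = rhs(d + h * k3d, v + h * k3v)
        d += h * (k1d + 2.0 * k2d + 2.0 * k3d + k4d) / 6.0
        v += h * (k1v + 2.0 * k2v + 2.0 * k3v + k4v) / 6.0
    return d, v


if __name__ == "__main__":
    delta, slope = grow_k0(1000.0)
    print(f"delta(1000) = {delta:.4f}, local exponent = {slope / delta:.5f}")
```

After three decades in time the local exponent $d\ln\delta/d\ln t$ is indistinguishable from $2/3$, and the amplitude matches the analytic value $\frac{3}{5}t^{2/3} + \frac{2}{5}t^{-1}$.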

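The disturbance is supposed here to start at a time $t_R$ corresponding to the large red shift

$$1 + z_R = \frac{R(t_0)}{R(t_R)} \simeq \frac{4000\ {}^\circ\text{K}}{2.7\ {}^\circ\text{K}} \simeq 1500$$

The initial values of the parameters $\theta$, $t$, or $\Psi$ for $k = +1$, $k = 0$, or $k = -1$ are then

$$\theta_R \simeq \left[\frac{2(1 - \cos\theta_0)}{1 + z_R}\right]^{1/2}$$

$$t_R \simeq t_0(1 + z_R)^{-3/2}$$

$$\Psi_R \simeq \left[\frac{2(\cosh\Psi_0 - 1)}{1 + z_R}\right]^{1/2}$$

Hence the density contrast $\delta$ could at most grow by an amplification factor

$$A_0 \equiv \frac{\delta_+(t_0)}{\delta_+(t_R)} \qquad (15.9.33)$$

given by

$$A_0 = \begin{cases} \dfrac{5(1 + z_R)}{(1 - \cos\theta_0)^3}\left\{-3\theta_0\sin\theta_0 + (1 - \cos\theta_0)(5 + \cos\theta_0)\right\} & (k = +1) \\[2ex] (1 + z_R) & (k = 0) \\[2ex] \dfrac{5(1 + z_R)}{(\cosh\Psi_0 - 1)^3}\left\{-3\Psi_0\sinh\Psi_0 + (\cosh\Psi_0 - 1)(\cosh\Psi_0 + 5)\right\} & (k = -1) \end{cases} \qquad (15.9.34)$$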
The parameters $\theta_0$ and $\Psi_0$ can be expressed in terms of $q_0$ by using Eq. (15.3.10) or (15.3.19). We then find that $A_0$ is a monotonically increasing function of $q_0$ for $q_0 > 0$, rising from $A_0 \simeq 5q_0(1 + z_R)$ for $q_0 \ll \frac{1}{2}$, to $A_0 = 1 + z_R$ at $q_0 = \frac{1}{2}$, to $A_0 = 1.45(1 + z_R)$ at $q_0 = 1$, and approaching the upper bound $A_0 = 5(1 + z_R)$ for $q_0 \gg 1$. With $q_0$ between 0.014 and 2 and $1 + z_R = 1500$, a small fluctuation would have grown from the time of hydrogen recombination until the present by a factor $A_0$ between 100 and 3000.
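
The quoted range of amplification factors can be reproduced with a short calculation. The sketch below assumes the relations $1 - \cos\theta_0 = (2q_0 - 1)/q_0$ and $\cosh\Psi_0 - 1 = (1 - 2q_0)/q_0$, which follow from setting $R = R_0$ in the parametric formulas for $R(t)/R_0$, and it assumes that the denominator in the amplification factor is the cube of $(1 - \cos\theta_0)$ or $(\cosh\Psi_0 - 1)$, the power needed to recover the limits $5q_0(1 + z_R)$ and $5(1 + z_R)$. The function name is illustrative.

```python
import math


def amplification(q0, one_plus_z=1500.0):
    """Linear growth factor A_0 from recombination to the present.

    Note: very close to q0 = 1/2 the bracket suffers from floating-point
    cancellation, so only the exact value q0 == 0.5 uses the k = 0 formula.
    """
    if q0 <= 0.0:
        raise ValueError("q0 must be positive")
    if q0 == 0.5:                        # k = 0
        return one_plus_z
    if q0 > 0.5:                         # k = +1
        c = (2.0 * q0 - 1.0) / q0        # 1 - cos(theta_0)
        theta = math.acos(1.0 - c)
        bracket = -3.0 * theta * math.sin(theta) + c * (5.0 + math.cos(theta))
    else:                                # k = -1
        c = (1.0 - 2.0 * q0) / q0        # cosh(Psi_0) - 1
        psi = math.acosh(1.0 + c)
        bracket = -3.0 * psi * math.sinh(psi) + c * (math.cosh(psi) + 5.0)
    return 5.0 * one_plus_z * bracket / c ** 3


if __name__ == "__main__":
    for q0 in (0.014, 0.5, 1.0, 2.0, 100.0):
        print(f"q0 = {q0:>7}: A_0 = {amplification(q0):8.1f}")
```

For $q_0 = 0.014$ and $q_0 = 2$ the factor comes out near 100 and 3000 respectively, and for $q_0 = 1$ it is about $1.44(1 + z_R)$, close to the rounded value given above.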

The condensations observed in the present universe cannot be considered “small disturbances.” For instance, the mass density in a typical cluster of galaxies is of order $10^{-28}$ g/cm³, an order of magnitude greater than the maximum likely value for the universe as a whole, and the mass density within a galaxy is of course even greater. The simple linear instability theory described above is therefore not applicable to the whole history of inhomogeneities up to the present moment. However, it seems reasonable to suppose that the present strong condensations grew out of small disturbances, so that a necessary (if perhaps not sufficient) condition for their formation is that the perturbation $\delta_+(t)$ calculated in linear stability theory should have become of order unity at some time before the present. This then sets a lower limit on the magnitude of the initial disturbance at the time of recombination,

$$|\delta_+(t_R)| \gtrsim \frac{1}{A_0} \qquad (15.9.35)$$

so that $|\delta_+(t_R)|$ should be greater than about $10^{-2}$ to $3 \times 10^{-4}$, depending on the value of $q_0$. In order to say how much greater, it would be necessary to know the time at which the disturbance reached the beginning of the nonlinear regime, with $|\delta_+(t)| \approx 1$. According to Weymann,¹⁵⁴ the observed binding energies of galaxies indicate that this must have occurred after a time of order $7 \times 10^7$ years; if $\delta_+$ reached unity this early, then it must have been already quite large at the time of recombination. On the other hand, if the concentration of quasi-stellar object red shifts at $z \approx 2$ marks the onset of galaxy formation, then (15.9.35) provides a reasonably good estimate of the actual value of $|\delta_+(t_R)|$. As remarked at the end of the last section, the protogalactic fluctuations at the time of recombination would produce fractional fluctuations in the observed microwave background temperature equal to $|\delta_+(t_R)|/3$ over angles of order $30''$, provided that Thomson scattering during or after recombination does not smooth out the fluctuations.¹⁵²ᵃ Even if the nonlinear regime was reached rather recently, $\Delta T_\gamma/T_\gamma$ would be of order $3 \times 10^{-3}$ to $10^{-4}$, which should be observable.

(B) Zero Curvature Solutions 

It is also interesting to consider the behavior of the solutions when the pressure term $v_s^2\mathbf{q}^2/R^2$ in Eq. (15.9.23) is not neglected, particularly in order to locate the precise dividing line between stable and unstable fluctuations. In order to obtain exact solutions, it is necessary now to restrict our attention to early times, when $R(t) \ll R_0$, so that the terms $\dot R^2$ and $8\pi\rho G R^2/3$ in Eq. (15.9.4) are much larger than unity, and even for $k = \pm 1$ we can use the $k = 0$ formulas for $R$ and $\rho$:

$$R \propto t^{2/3} \qquad (15.9.36)$$

$$\rho = (6\pi G t^2)^{-1} \qquad (15.9.37)$$


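(This is not much of a limitation, because for recent times, when these formulas may not be valid, the Jeans mass is so small that its precise value is of little interest.) For a general specific heat ratio $\gamma$, the pressure varies as $\rho^\gamma$, and the speed of sound is

$$v_s = \left(\frac{\gamma p}{\rho}\right)^{1/2} \propto \rho^{(\gamma - 1)/2} \propto t^{1 - \gamma} \qquad (15.9.38)$$

Hence Eq. (15.9.23) reads here

$$\ddot\delta + \frac{4}{3t}\dot\delta + \left(\frac{\Lambda^2}{t^{2\gamma - 2/3}} - \frac{2}{3t^2}\right)\delta = 0 \qquad (15.9.39)$$

where $\Lambda^2$ is the constant

$$\Lambda^2 \equiv \frac{t^{2\gamma - 2/3}\,v_s^2\mathbf{q}^2}{R^2} \qquad (15.9.40)$$

The solutions of Eq. (15.9.39) for $\gamma > 4/3$ are

$$\delta_\pm \propto t^{-1/6}\,J_{\mp 5/6\nu}\!\left(\frac{\Lambda t^{-\nu}}{\nu}\right) \qquad (15.9.41)$$

where $J$ is the usual Bessel function, and

$$\nu \equiv \gamma - \frac{4}{3} > 0 \qquad (15.9.42)$$

The Bessel functions oscillate for $t \ll \Lambda^{1/\nu}$, whereas for $t \gg \Lambda^{1/\nu}$ the solutions behave like

$$\delta_\pm \propto t^{-1/6 \pm 5/6} \qquad (15.9.43)$$

By using Eqs. (15.9.37) and (15.9.40), the condition $t \gg \Lambda^{1/\nu}$ for growth in the $\delta_+$ mode can be written

$$\frac{v_s^2\mathbf{q}^2}{R^2} \ll 6\pi G\rho$$

which is substantially the same as the Jeans condition $v_s^2\mathbf{k}^2 < 4\pi G\rho$.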

The solutions (15.9.41) will apply after recombination, with $\gamma \simeq 5/3$. Also, for a relatively high present density, there is an appreciable period before recombination when the total energy density is dominated by hydrogen rest-mass, but the pressure is dominated by radiation. The cosmic medium then behaves like a nonrelativistic fluid with $\gamma = 4/3$, so that Eq. (15.9.39) reads

$$\ddot\delta + \frac{4}{3t}\dot\delta + \frac{1}{t^2}\left(\Lambda^2 - \frac{2}{3}\right)\delta = 0 \qquad (15.9.44)$$

with

$$\Lambda^2 \equiv \frac{v_s^2\mathbf{q}^2 t^2}{R^2} = \frac{v_s^2\mathbf{q}^2}{6\pi G\rho R^2} \qquad (15.9.45)$$

The solutions of Eq. (15.9.44) are quite simple:

$$\delta_\pm \propto t^{\alpha_\pm} \qquad \alpha_\pm = -\tfrac{1}{6} \pm \left(\tfrac{25}{36} - \Lambda^2\right)^{1/2} \qquad (15.9.46)$$

Both solutions undergo a gently damped oscillation for $\Lambda > 5/6$, and decay for $5/6 > \Lambda > \sqrt{2/3}$, whereas for $\Lambda < \sqrt{2/3}$, $\delta_+$ grows and $\delta_-$ decays. The condition for growth in the $\delta_+$ mode is then

$$\frac{v_s^2\mathbf{q}^2}{6\pi G\rho R^2} < \frac{2}{3}$$

which is precisely the same as Jeans's condition $v_s^2\mathbf{k}^2 < 4\pi G\rho$.

It is not difficult to incorporate dissipative effects in this formalism. The most interesting case here is the damping owing to a finite photon mean free path during the matter-dominated part of “Phase B,” that is, during the period prior to recombination when the radiation density $aT^4$ is much less than the matter density $nm_H$ and the Jeans mass is much greater than a galactic mass. According to Eqs. (15.8.27)-(15.8.30), the effects of viscosity are negligible here compared with the effects of heat conduction, so dissipative effects can be taken into account¹⁵⁵ by using the equation of state to express the pressure perturbation $p_1$ in Eq. (15.9.7) in terms of the temperature and mass-density perturbations $T_1$ and $\rho_1$, and supplementing Eqs. (15.9.6)-(15.9.9) with the usual nonrelativistic equation of radiative heat transfer. We need not go into this further here, since heat conduction as well as viscosity will be incorporated into the general-relativistic theory described in the next section.
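
The three regimes of the $\gamma = 4/3$ solutions $\delta_\pm \propto t^{\alpha_\pm}$, with $\alpha_\pm = -\frac{1}{6} \pm (\frac{25}{36} - \Lambda^2)^{1/2}$, can be tabulated with a small helper. This is a sketch for illustration; the function names are not from the text, and complex exponents are used to represent the oscillating case.

```python
import cmath
import math


def exponents(lam):
    """Return (alpha_plus, alpha_minus) for delta ~ t**alpha at gamma = 4/3."""
    root = cmath.sqrt(25.0 / 36.0 - lam ** 2)
    return -1.0 / 6.0 + root, -1.0 / 6.0 - root


def classify(lam):
    """Name the regime for a given Lambda >= 0."""
    if lam > 5.0 / 6.0:
        return "damped oscillation"      # complex exponents, real part -1/6
    if lam > math.sqrt(2.0 / 3.0):
        return "decay"                   # both exponents real and negative
    return "growth"                      # alpha_plus >= 0 (marginal at equality)


if __name__ == "__main__":
    for lam in (0.0, 0.5, 0.82, 0.9, 2.0):
        a_plus, a_minus = exponents(lam)
        print(f"Lambda = {lam:4.2f}: alpha+ = {a_plus:.3f}, alpha- = {a_minus:.3f}, {classify(lam)}")
```

At $\Lambda = 0$ the exponents reduce to $2/3$ and $-1$, the pressure-free values, and $\alpha_+$ passes through zero exactly at $\Lambda = \sqrt{2/3}$, which is the Jeans limit.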


10 General-Relativistic Theory of Small Fluctuations

The nonrelativistic analysis presented in the last section is adequate for the study of compressional and rotational perturbations during the matter-dominated era, when $p \ll \rho$. However, a relativistic treatment is needed to deal with the radiation or lepton-dominated eras, when $p$ is of the same order as $\rho$, or to treat the propagation of gravitational radiation in any era.

The relativistic theory of small disturbances in an expanding universe is rather complicated, so the theory presented here will deal only with the simplest case, when the unperturbed Robertson-Walker metric has curvature $k = 0$. This is not too severe a restriction, because the results for $k = +1$ or $k = -1$ are essentially the same as for $k = 0$, provided that we keep to the early universe when $\dot R^2 \gg |k|$, and provided that we keep to perturbations of wavelength much less than $R$. These are the most interesting cases in any event, particularly since the growth of condensations in the recent past can be treated for any value of $k$ by the nonrelativistic theory of the last section.

It proves convenient here to include the effects of dissipation from the beginning. The medium will be characterized by a coefficient of shear viscosity $\eta$ and a heat conduction coefficient $\chi$; as mentioned in Section 15.8, the bulk viscosity $\zeta$ will be negligible once the pressure and kinetic energy density of matter drop below those of radiation. The effects of dissipation can be taken into account by adding suitable terms to the energy-momentum tensor. These terms were calculated in Section 2.11 for a relativistic imperfect fluid in the absence of gravitation; the correct energy-momentum tensor in the presence of gravitation can be immediately obtained by writing Eq. (2.11.21) in a generally covariant form:

$$T^{\mu\nu} = p\,g^{\mu\nu} + (p + \rho)U^\mu U^\nu - \eta\,H^{\mu\rho}H^{\nu\sigma}W_{\rho\sigma} - \chi\left(H^{\mu\rho}U^\nu + H^{\nu\rho}U^\mu\right)Q_\rho \qquad (15.10.1)$$

where

$$W_{\mu\nu} \equiv U_{\mu;\nu} + U_{\nu;\mu} - \tfrac{2}{3}\,g_{\mu\nu}U^\gamma{}_{;\gamma} \qquad (15.10.2)$$

$$Q_\mu \equiv T_{;\mu} + T\,U_{\mu;\nu}U^\nu \qquad (15.10.3)$$

$$H_{\mu\nu} \equiv g_{\mu\nu} + U_\mu U_\nu \qquad (15.10.4)$$

It is easy to check that the dissipative terms in $T^{\mu\nu}$ vanish for a Robertson-Walker metric (with any $k$), so the Friedmann solutions still provide our starting point. In particular, for $k = 0$ the Einstein equations have the familiar unperturbed solution

$$g_{tt} = -1 \qquad g_{ti} = 0 \qquad g_{ij} = R^2(t)\,\delta_{ij} \qquad (15.10.5)$$

$$U^t = 1 \qquad U^i = 0 \qquad (15.10.6)$$

where $x^i$ (with $i = 1, 2, 3$) are a set of quasi-Euclidean comoving coordinates, and

$$\dot R^2 = \frac{8\pi\rho G R^2}{3} \qquad (15.10.7)$$

The only independent nonvanishing components of the unperturbed affine connection are then given by Eqs. (15.1.3)-(15.1.5) as

$$\Gamma^i_{tj} = \frac{\dot R}{R}\,\delta_{ij} \qquad \Gamma^t_{ij} = R\dot R\,\delta_{ij} \qquad (15.10.8)$$

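We shall consider a disturbance, in which the metric is $g_{\mu\nu} + h_{\mu\nu}$, with $h_{\mu\nu}$ small. Before we write down the field equations for $h_{\mu\nu}$, it is useful to recall the remark of Section 10.9, that by performing a coordinate transformation (10.9.6), we can convert the solution $h_{\mu\nu}$ into an equivalent solution

$$h^*_{\mu\nu} = h_{\mu\nu} + \epsilon_{\mu;\nu} + \epsilon_{\nu;\mu}$$

with $\epsilon_\mu$ an arbitrary small vector field. Using Eq. (15.10.8), this new solution takes the form

$$h^*_{ij} = h_{ij} + \frac{\partial\epsilon_i}{\partial x^j} + \frac{\partial\epsilon_j}{\partial x^i} - 2R\dot R\,\delta_{ij}\,\epsilon_t \qquad (15.10.9)$$

$$h^*_{it} = h_{it} + \frac{\partial\epsilon_i}{\partial t} + \frac{\partial\epsilon_t}{\partial x^i} - \frac{2\dot R}{R}\,\epsilon_i \qquad (15.10.10)$$

$$h^*_{tt} = h_{tt} + 2\frac{\partial\epsilon_t}{\partial t} \qquad (15.10.11)$$

It is extremely convenient to choose $\epsilon_\mu$ so that

$$h^*_{it} = h^*_{tt} = 0$$

thus maintaining to the greatest possible extent the form of the unperturbed metric (15.10.5). This can be accomplished by constructing $\epsilon_\mu$ according to the prescriptions

$$\epsilon_t = -\frac{1}{2}\int h_{tt}\,dt$$

$$\epsilon_i = -R^2\int R^{-2}\left[h_{it} + \frac{\partial\epsilon_t}{\partial x^i}\right]dt$$

Dropping the asterisk, it will simply be assumed from now on that the coordinate system is chosen so that

$$h_{it} = h_{tt} = 0 \qquad (15.10.12)$$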


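The perturbation $\delta g_{\mu\nu} = h_{\mu\nu}$ produces in the affine connection a perturbation (10.9.1), which here has the components

$$\delta\Gamma^i_{jk} = \frac{1}{2R^2}\left[h_{ij;k} + h_{ik;j} - h_{jk;i}\right] = \frac{1}{2R^2}\left[\frac{\partial h_{ij}}{\partial x^k} + \frac{\partial h_{ik}}{\partial x^j} - \frac{\partial h_{jk}}{\partial x^i}\right] \qquad (15.10.13)$$

$$\delta\Gamma^t_{jk} = -\frac{1}{2}\left[h_{tj;k} + h_{tk;j} - h_{jk;t}\right] = \frac{1}{2}\frac{\partial h_{jk}}{\partial t} \qquad (15.10.14)$$

$$\delta\Gamma^i_{jt} = \frac{1}{2R^2}\left[h_{it;j} + h_{ij;t} - h_{tj;i}\right] = \frac{1}{2R^2}\left[\frac{\partial h_{ij}}{\partial t} - \frac{2\dot R}{R}h_{ij}\right] \qquad (15.10.15)$$

$$\delta\Gamma^i_{tt} = \delta\Gamma^t_{ti} = \delta\Gamma^t_{tt} = 0 \qquad (15.10.16)$$

Contracting, we find also that

$$\delta\Gamma_\mu \equiv \delta\Gamma^\nu_{\nu\mu} = \frac{\partial}{\partial x^\mu}\left(\frac{h_{kk}}{2R^2}\right) \qquad (15.10.17)$$

it being understood that repeated Latin indices are summed over the values 1, 2, 3.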
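The perturbation in the Ricci tensor is then given by

$$\delta R_{ij} = (\delta\Gamma_i)_{;j} - (\delta\Gamma^\lambda_{ij})_{;\lambda} = \frac{\partial\,\delta\Gamma_i}{\partial x^j} - \frac{\partial\,\delta\Gamma^k_{ij}}{\partial x^k} - \frac{\partial\,\delta\Gamma^t_{ij}}{\partial t} - R\dot R\,\delta_{ij}\,\delta\Gamma_t - \frac{\dot R}{R}\,\delta\Gamma^t_{ij} + R\dot R\left(\delta\Gamma^i_{tj} + \delta\Gamma^j_{ti}\right)$$

$$\delta R_{ti} = (\delta\Gamma_t)_{;i} - (\delta\Gamma^\lambda_{ti})_{;\lambda} = \frac{\partial\,\delta\Gamma_t}{\partial x^i} - \frac{\partial\,\delta\Gamma^j_{ti}}{\partial x^j}$$

$$\delta R_{tt} = (\delta\Gamma_t)_{;t} - (\delta\Gamma^\lambda_{tt})_{;\lambda} = \frac{\partial\,\delta\Gamma_t}{\partial t} + \frac{2\dot R}{R}\,\delta\Gamma_t$$

or, more explicitly,

$$\delta R_{ij} = \frac{1}{2R^2}\left[\nabla^2 h_{ij} - \frac{\partial^2 h_{ik}}{\partial x^j\,\partial x^k} - \frac{\partial^2 h_{jk}}{\partial x^i\,\partial x^k} + \frac{\partial^2 h_{kk}}{\partial x^i\,\partial x^j}\right] - \frac{1}{2}\frac{\partial^2 h_{ij}}{\partial t^2} + \frac{\dot R}{2R}\left[\dot h_{ij} - \delta_{ij}\dot h_{kk}\right] + \frac{\dot R^2}{R^2}\left[-2h_{ij} + \delta_{ij}h_{kk}\right] \qquad (15.10.18)$$

$$\delta R_{ti} = \frac{1}{2}\frac{\partial}{\partial t}\left[\frac{1}{R^2}\left(\frac{\partial h_{kk}}{\partial x^i} - \frac{\partial h_{ki}}{\partial x^k}\right)\right] \qquad (15.10.19)$$

$$\delta R_{tt} = \frac{1}{2R^2}\left[\ddot h_{kk} - \frac{2\dot R}{R}\dot h_{kk} + 2\left(\frac{\dot R^2}{R^2} - \frac{\ddot R}{R}\right)h_{kk}\right] \qquad (15.10.20)$$
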
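According to Eq. (15.10.1), the source term on the right-hand side of Einstein's field equations is

$$S_{\mu\nu} \equiv T_{\mu\nu} - \tfrac{1}{2}g_{\mu\nu}T^\lambda{}_\lambda = \tfrac{1}{2}(\rho - p)\,g_{\mu\nu} + (p + \rho)U_\mu U_\nu - \eta\,H_\mu{}^\rho H_\nu{}^\sigma W_{\rho\sigma} - \chi\left(H_\mu{}^\rho U_\nu + H_\nu{}^\rho U_\mu\right)Q_\rho \qquad (15.10.21)$$

In order to preserve the normalization of the velocity $U$, we must have

$$0 = \delta\left(g_{\mu\nu}U^\mu U^\nu\right) = -2U_1^t$$

The perturbations $h_{ij}$, $U_1^i$, $\rho_1$, $p_1$, and $T_1$ then produce in $S_{\mu\nu}$ the perturbations

$$\delta S_{ij} = \tfrac{1}{2}(\rho - p)h_{ij} + \tfrac{1}{2}R^2\delta_{ij}(\rho_1 - p_1) - \eta R^4\,\delta W^{ij} \qquad (15.10.22)$$

$$\delta S_{it} = -R^2(\rho + p)U_1^i - \chi\dot T\,\delta H_{it} + \chi R^2\,\delta Q^i \qquad (15.10.23)$$

$$\delta S_{tt} = \tfrac{1}{2}(\rho_1 + 3p_1) - 2\chi\dot T\,\delta H_{tt} \qquad (15.10.24)$$

where

$$\delta H_{it} = -R^2 U_1^i \qquad \delta H_{tt} = 0 \qquad (15.10.25)$$

$$\delta W^{ij} = R^{-2}\left[\frac{\partial U_1^i}{\partial x^j} + \frac{\partial U_1^j}{\partial x^i} - \frac{2}{3}\delta_{ij}\nabla\cdot\mathbf{U}_1\right] + R^{-2}\frac{\partial}{\partial t}\left\{R^{-2}\left[h_{ij} - \tfrac{1}{3}\delta_{ij}h_{kk}\right]\right\} \qquad (15.10.26)$$

$$\delta Q^i = R^{-2}\left(\frac{\partial T_1}{\partial x^i} + T\frac{\partial}{\partial t}\left(R^2 U_1^i\right)\right) \qquad (15.10.27)$$

Finally, the perturbed Einstein field equations here take the form

$$\delta R_{\mu\nu} = -8\pi G\,\delta S_{\mu\nu} \qquad (15.10.28)$$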

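Putting together Eqs. (15.10.18)-(15.10.20) and (15.10.22)-(15.10.28), we find

$$\nabla^2 h_{ij} - \frac{\partial^2 h_{ik}}{\partial x^j\,\partial x^k} - \frac{\partial^2 h_{jk}}{\partial x^i\,\partial x^k} + \frac{\partial^2 h_{kk}}{\partial x^i\,\partial x^j} - R^2\ddot h_{ij} + R\dot R\left[\dot h_{ij} - \delta_{ij}\dot h_{kk}\right] + 2\dot R^2\left[-2h_{ij} + \delta_{ij}h_{kk}\right]$$
$$= -8\pi G(\rho - p)R^2 h_{ij} - 8\pi G R^4\delta_{ij}(\rho_1 - p_1) + 16\pi G\eta R^4\left\{\frac{\partial U_1^i}{\partial x^j} + \frac{\partial U_1^j}{\partial x^i} - \frac{2}{3}\delta_{ij}\nabla\cdot\mathbf{U}_1 + \frac{\partial}{\partial t}\left[R^{-2}\left(h_{ij} - \tfrac{1}{3}\delta_{ij}h_{kk}\right)\right]\right\} \qquad (15.10.29)$$

$$\frac{\partial}{\partial t}\left[\frac{1}{R^2}\left(\frac{\partial h_{kk}}{\partial x^i} - \frac{\partial h_{ki}}{\partial x^k}\right)\right] = 16\pi G R^2(\rho + p)U_1^i - 16\pi G\chi\dot T R^2 U_1^i - 16\pi G\chi\left(\frac{\partial T_1}{\partial x^i} + T\frac{\partial}{\partial t}\left(R^2 U_1^i\right)\right) \qquad (15.10.30)$$

$$\ddot h_{kk} - \frac{2\dot R}{R}\dot h_{kk} + 2\left(\frac{\dot R^2}{R^2} - \frac{\ddot R}{R}\right)h_{kk} = -8\pi G(\rho_1 + 3p_1)R^2 \qquad (15.10.31)$$
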

The equations of motion for the fluid can be obtained either from the conservation equations $T^{\mu\nu}{}_{;\mu} = 0$, or directly from the field equations. By applying the operators $\partial/\partial x^j$ and $\partial/\partial t + 3\dot R/R$ to Eqs. (15.10.29) and (15.10.30), and using (15.10.30), (15.10.31), and the trace of (15.10.29) to simplify the result, we obtain the momentum conservation equation

[i + x# 

= - R + qR 3 [v 2 U l i + — 7-U, + 2 -0r 2 ^A 1 (15.10.32) 

dx l [ 3 dx l 3 dt \ dx l J J 

(The vector $\mathbf{U}_1$ is understood to have components $U_1^i$, not $U_{1i}$.) Also, by taking the divergence of Eq. (15.10.30) and using Eqs. (15.10.31) and the trace of Eq. (15.10.29), we obtain the energy conservation equation

a + f (,,+<"> = -<p {*(&) + *•».} 


(15.10.33) 

As usual in dealing with dissipative processes, we must also make use of the conservation law for the particle current $nU^\mu$:

$$0 = (nU^\mu)_{;\mu} = U^\mu\frac{\partial n}{\partial x^\mu} + nU^\mu{}_{;\mu}$$

(Strictly speaking, $n$ should be taken as the density of baryon number or lepton numbers.) For the unperturbed solution, this gives the familiar result

$$n \propto R^{-3}$$

and to first order in the perturbations $n_1$, $\mathbf{U}_1$, and $h_{ij}$, we have

$$0 = \frac{\partial n_1}{\partial t} + \frac{3\dot R}{R}n_1 + n\left(\nabla\cdot\mathbf{U}_1 + \delta\Gamma^\mu_{\mu t}\right)$$

or, using (15.10.17),

$$\frac{\partial}{\partial t}\left(\frac{n_1}{n}\right) + \nabla\cdot\mathbf{U}_1 + \frac{\partial}{\partial t}\left(\frac{h_{kk}}{2R^2}\right) = 0 \qquad (15.10.34)$$


Equations (15.10.29)-(15.10.34) provide a convenient set of fundamental equations, but it should be kept in mind that these equations are not all independent, as shown by the way they were derived.

One solution of these equations can be found immediately:

$$h_{ij}(\mathbf{x}, t) = R^2(t)\left[\frac{\partial f_i(\mathbf{x})}{\partial x^j} + \frac{\partial f_j(\mathbf{x})}{\partial x^i}\right]$$

$$\rho_1 = p_1 = \mathbf{U}_1 = n_1 = T_1 = 0 \qquad (15.10.35)$$

with $f_i$ an arbitrary function of position. [To verify Eq. (15.10.29), use Eq. (15.1.20).] However, reference to Eqs. (15.10.9)-(15.10.11) shows that this is not a physical disturbance, but represents the effects of an infinitesimal coordinate transformation of the form (10.9.6):

$$x^\mu \to x^\mu - \epsilon^\mu(x)$$

$$\epsilon_t = 0 \qquad \epsilon_i(\mathbf{x}, t) = R^2(t)\,f_i(\mathbf{x}) \qquad (15.10.36)$$

whose structure is such as to preserve the vanishing of $h_{it}$ and $h_{tt}$. We are interested only in physical disturbances, whose form necessarily differs from (15.10.35).

The manifest spatial homogeneity of equations (15.10.29)-(15.10.34) allows us to find solutions with the spatial dependence

$$h_{ij},\ p_1,\ \rho_1,\ \mathbf{U}_1,\ n_1,\ T_1 \propto \exp(i\mathbf{q}\cdot\mathbf{x}) \qquad (15.10.37)$$

with $\mathbf{q}$ a constant wave number. Just as in the nonrelativistic case, it is convenient to analyze the general solution into normal modes. Now there are modes of three different kinds:

Radiative Modes 

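There is a simple class of solutions with

$$0 = h_{kk} = q_i h_{ij} = \rho_1 = p_1 = \mathbf{U}_1 = n_1 = T_1 \qquad (15.10.38)$$

Equations (15.10.30)-(15.10.34) are here trivially satisfied, and Eq. (15.10.29), together with Eq. (15.1.20), gives

$$\ddot h_{ij} + \left(-\frac{\dot R}{R} + 16\pi G\eta\right)\dot h_{ij} + \left(\frac{\mathbf{q}^2}{R^2} - \frac{2\ddot R}{R} - \frac{32\pi G\eta\dot R}{R}\right)h_{ij} = 0 \qquad (15.10.39)$$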
For very large wave numbers, we can find a general second-order WKB solution

$$h_{ij} \propto R\,\exp\left\{\int\left[\pm\frac{i|\mathbf{q}|}{R} - 8\pi G\eta\right]dt\right\} \qquad (15.10.40)$$

With $R$ and $\eta$ slowly varying, this result may be converted from the comoving spatial coordinate system to a nearly Minkowskian system by multiplying $h_{ij}$ by a scale factor $1/R^2$. Thus (15.10.40) corresponds to a plane gravitational wave of the form (10.2.1), with

$$e_{\mu\nu} \propto \frac{1}{R}\exp\left\{-8\pi G\int\eta\,dt\right\} \qquad |\mathbf{k}| = \frac{|\mathbf{q}|}{R}$$

According to Eq. (10.3.7), the energy density $t^{00}$ of these gravitational waves decreases as

$$t^{00} \propto R^{-4}\exp\left\{-16\pi G\int\eta\,dt\right\} \qquad (15.10.41)$$

The factor $R^{-4}$ is just what we should expect for the free expansion of any wave representing a massless particle. [Compare Eq. (15.1.23).] The extra factor in (15.10.41) tells us that gravitational waves in a viscous medium are absorbed at a rate:¹⁵⁶

$$\Gamma_g = 16\pi G\eta \qquad (15.10.42)$$


Generally, $\eta$ will be of the order of the thermal energy density times some typical
mean free time $\tau$, so that $\Gamma_g$ is at most of order $\dot R^2 \tau / R^2$. Hence as long as the
collision rate $\tau^{-1}$ is much greater than the expansion rate $\dot R/R$, the damping rate
$\Gamma_g$ will be much less than the expansion rate $\dot R/R$, and viscosity will have little
effect on the wave propagation. With viscosity neglected, and with $R(t)$ assumed
to have a power-law time dependence

$$ R(t) \propto t^n \tag{15.10.43} $$

we can find a solution of Eq. (15.10.39) valid for all wave numbers,

$$ h_{ij} \propto t^{(n+1)/2} \, J_{\pm\nu}\!\left( \frac{|\mathbf{q}| t}{(1-n) R} \right) \tag{15.10.44} $$

where $J_{\pm\nu}$ is the usual Bessel function of order $\pm\nu$, and

$$ \nu = \frac{3n - 1}{2 - 2n} \tag{15.10.45} $$


(In the matter-dominated or radiation-dominated eras, Eq. (15.10.43) holds with
$n = 2/3$ or $n = 1/2$, respectively.) Unlike the case of electromagnetic waves in a
plasma, there is no sharp lower limit to the frequencies at which gravitational
waves can propagate; instead, the solutions start at $|\mathbf{q}|t \ll R$ with the behavior

$$ h_{ij} \propto t^{2n} \quad \text{or} \quad t^{1-n} \tag{15.10.46} $$

and gradually go over for $|\mathbf{q}|t \gg R$ to the wavelike solutions (15.10.40).
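
The Bessel order and the two long-wavelength power laws can be checked with a few lines of exact arithmetic. The sketch below is only an illustration: the function names are invented here, and the indicial check assumes the $q \to 0$, $\eta = 0$ limit of Eq. (15.10.39) with $\dot R/R = n/t$, that is, $\ddot h - (n/t)\dot h - 2n(n-1)h/t^2 = 0$.

```python
from fractions import Fraction

def bessel_order(n):
    """Order nu = (3n - 1)/(2 - 2n) of the Bessel function, Eq. (15.10.45)."""
    n = Fraction(n)
    return (3 * n - 1) / (2 - 2 * n)

def long_wavelength_exponents(n):
    """Exponents of the two power laws in Eq. (15.10.46): t**(2n) and t**(1 - n)."""
    n = Fraction(n)
    return 2 * n, 1 - n

def indicial_residual(m, n):
    """Residual of the trial solution h = t**m in the assumed q -> 0 limit,
    h'' - (n/t) h' - 2n(n - 1) h / t**2 = 0, after dividing by t**(m - 2)."""
    m, n = Fraction(m), Fraction(n)
    return m * (m - 1) - n * m - 2 * n * (n - 1)

for era, n in [("matter-dominated", Fraction(2, 3)),
               ("radiation-dominated", Fraction(1, 2))]:
    exponents = long_wavelength_exponents(n)
    residuals = [indicial_residual(m, n) for m in exponents]
    print(era, "nu =", bessel_order(n), "exponents =", exponents,
          "residuals =", residuals)
```

Both residuals vanish for any $n$, which is one way to see that the behaviors quoted in (15.10.46) are the two independent long-wavelength solutions.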

Rotational Modes 

There is also a simple class of solutions with

$$ 0 = h_{kk} = \mathbf{q}\cdot\mathbf{U}_1 = \rho_1 = p_1 = n_1 = T_1 \tag{15.10.47} $$

Here Eqs. (15.10.31), (15.10.33), and (15.10.34) are trivially satisfied, and Eq.
(15.10.32) becomes an equation for the transverse part of $\mathbf{U}_1$:

$$ \left[ \frac{\partial}{\partial t} + 16\pi G\eta \right] \left[ R^5 (\rho + p - \chi\dot T)\, \mathbf{U}_1 \right] = -\eta R^3 q^2\, \mathbf{U}_1 \tag{15.10.48} $$

Equations (15.10.29) and (15.10.30) then dictate the gravitational field produced
by the rotations represented by $\mathbf{U}_1$; these field equations are automatically
consistent with (15.10.48), because the equations of motion from which (15.10.48)
was derived were themselves derived from the field equations. With dissipation
neglected, Eq. (15.10.48) simply tells us that $\mathbf{U}_1$ has a time dependence inversely
proportional to $R^5(\rho + p)$,

$$ \mathbf{U}_1 \propto \frac{1}{R^5 (\rho + p)} \tag{15.10.49} $$

which may be regarded as the relativistic generalization (in comoving coordinates)
of the Newtonian result (15.9.22).
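
To see what Eq. (15.10.49) implies in practice, the following sketch works out the power of $R$ with which rotational velocities scale. It rests on two assumptions that are not stated in the passage above: an equation of state $p = w\rho$ with constant $w$, so that $\rho + p \propto R^{-3(1+w)}$ in an adiabatic expansion, and the reading of $\mathbf{U}_1$ as a coordinate velocity, so that the proper velocity is $R\,\mathbf{U}_1$. The function name is hypothetical.

```python
from fractions import Fraction

def rotational_exponents(w):
    """Return (a, b) with U1 ~ R**a and proper velocity R*U1 ~ R**b.

    Uses U1 ~ 1/(R**5 (rho + p)) from Eq. (15.10.49) together with the
    assumed scaling rho + p ~ R**(-3(1 + w)) for p = w * rho.
    """
    w = Fraction(w)
    a = 3 * (1 + w) - 5      # exponent of R in the coordinate velocity
    return a, a + 1          # one more power of R for the proper velocity

print("dust (w = 0):", rotational_exponents(0))
print("radiation (w = 1/3):", rotational_exponents(Fraction(1, 3)))
```

Under these assumptions the proper rotational velocity of pressureless matter falls as $1/R$, while in a radiation-dominated fluid it stays constant.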

Compressional Modes 

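Once again, the richest time dependence is displayed by the compressional
modes, in which the quantities $h_{kk}$, $\mathbf{q}\cdot\mathbf{U}_1$, $\rho_1$, $p_1$, $T_1$, and $n_1$ are not constrained
to vanish. Equations (15.10.31), (15.10.33), (15.10.34), and the divergence of Eq.
(15.10.32) here provide a set of coupled equations for these quantities:

$$ \ddot{h}_{kk} - \frac{2\dot R}{R} \dot{h}_{kk} + 2\left( \frac{\dot R^2}{R^2} - \frac{\ddot R}{R} \right) h_{kk} = -8\pi G R^2 (\rho_1 + 3p_1) \tag{15.10.50} $$

$$ \dot\rho_1 + \frac{3\dot R}{R} (\rho_1 + p_1) = -(\rho + p) \left\{ \frac{\partial}{\partial t}\left( \frac{h_{kk}}{2R^2} \right) + i\mathbf{q}\cdot\mathbf{U}_1 \right\} \tag{15.10.51} $$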
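$$ \frac{\partial}{\partial t}\left( \frac{n_1}{n} \right) = -i\mathbf{q}\cdot\mathbf{U}_1 - \frac{\partial}{\partial t}\left( \frac{h_{kk}}{2R^2} \right) \tag{15.10.52} $$

$$ \left[ \frac{\partial}{\partial t} + 16\pi G\eta \right] \left[ i\mathbf{q}\cdot\mathbf{U}_1 R^5 (\rho + p - \chi\dot T) + \chi R^3 q^2 T_1 - i\chi T R^3 \frac{\partial}{\partial t}\left( R^2\, \mathbf{q}\cdot\mathbf{U}_1 \right) \right] = R^3 q^2 p_1 - \eta R^3 q^2 \left[ \frac{4i}{3}\, \mathbf{q}\cdot\mathbf{U}_1 + \frac{2}{3} \frac{\partial}{\partial t}\left( \frac{h_{kk}}{R^2} \right) \right] \tag{15.10.53} $$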


If we use the equation of state to express $\rho_1$ and $p_1$ in terms of $n_1$ and $T_1$, these
can be regarded as four equations for the four unknowns $h_{kk}$, $\mathbf{q}\cdot\mathbf{U}_1$, $n_1$, and $T_1$.
The reader may check that these equations yield the damping rate (15.8.30) for
fluctuation wave numbers that are much larger than the Jeans limit, for which
gravitation and the expansion of the universe may be neglected. Also, in the
nonrelativistic limit with damping neglected, these equations reduce to the
previously derived Newtonian equations (15.9.20) and (15.9.21), provided that we
identify


5 


Pi 

V 


e 


R_ 

I 2 


*"q ■ u i 


+ 1 — Mi 

2 dt \R 2 ) J 


For a thorough discussion of the normal modes described by Eqs. (15.10.50)-
(15.10.53), the reader is directed to the review of Field. 157 For our present purposes,
it will be sufficient to consider only the limiting case of very small wave
numbers. In the limit $q \to 0$, all dissipative effects vanish; indeed, by eliminating
$h_{kk}$ in Eqs. (15.10.51) and (15.10.52), we can show that the perturbations keep the
entropy constant, so that

$$ p_1 = v_s^2 \rho_1 \tag{15.10.54} $$

Also, it is convenient to use Eq. (15.1.21) to write Eq. (15.10.51) as

$$ -\frac{1}{2} \frac{\partial}{\partial t}\left( \frac{h_{kk}}{R^2} \right) = \frac{\dot\rho_1}{\rho + p} + \frac{3\dot R}{R}\, \frac{\rho_1 + p_1}{\rho + p} = \frac{\dot\rho_1}{\rho + p} - \frac{\rho_1 (\dot\rho + \dot p)}{(\rho + p)^2} = \frac{\partial}{\partial t}\left( \frac{\rho_1}{\rho + p} \right) $$

The addition of a time-independent term to $h_{kk}/R^2$ would correspond to a mere
coordinate transformation of the form (15.10.36), so the solution is essentially
unique,

$$ h_{kk} = -2R^2 \delta \tag{15.10.55} $$

with $\delta$ now defined by

$$ \rho_1 = (\rho + p)\, \delta \tag{15.10.56} $$
Using Eqs. (15.10.54)-(15.10.56) in (15.10.50) yields the second-order differential
equation

$$ \ddot\delta + \frac{2\dot R}{R} \dot\delta - 4\pi G (\rho + p)(1 + 3v_s^2)\, \delta = 0 \tag{15.10.57} $$

We are now at last able to calculate the growth rate in “Phase A,” that is, in the
early period when the Jeans mass is very small and the energy density is dominated
by radiation (and neutrinos). In this case, we have

$$ R \propto t^{1/2}, \qquad \rho \simeq 3p \simeq \frac{3}{32\pi G t^2}, \qquad v_s^2 \simeq \frac{1}{3} $$

and Eq. (15.10.57) reads

$$ \ddot\delta + t^{-1} \dot\delta - t^{-2} \delta = 0 \tag{15.10.58} $$

Again, there is a growing solution $\delta_+$ and a decaying solution $\delta_-$:

$$ \delta_+ \propto t, \qquad \delta_- \propto t^{-1} \tag{15.10.59} $$

but no exponential growth.
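
The power-law character of the two solutions can be confirmed numerically. The sketch below integrates $\ddot\delta + \dot\delta/t - \delta/t^2 = 0$ with a hand-written fourth-order Runge-Kutta step; the helper names, the starting time $t = 1$, and the step count are arbitrary illustrative choices, not anything prescribed by the text.

```python
def rhs(t, delta, delta_dot):
    """Eq. (15.10.58) as a first-order system: returns (delta', delta'')."""
    return delta_dot, -delta_dot / t + delta / t**2

def integrate(t0, t1, delta0, delta_dot0, steps=20000):
    """Classical fourth-order Runge-Kutta integration from t0 to t1."""
    h = (t1 - t0) / steps
    t, y, v = t0, delta0, delta_dot0
    for _ in range(steps):
        k1y, k1v = rhs(t, y, v)
        k2y, k2v = rhs(t + h / 2, y + h / 2 * k1y, v + h / 2 * k1v)
        k3y, k3v = rhs(t + h / 2, y + h / 2 * k2y, v + h / 2 * k2v)
        k4y, k4v = rhs(t + h, y + h * k3y, v + h * k3v)
        y += h / 6 * (k1y + 2 * k2y + 2 * k3y + k4y)
        v += h / 6 * (k1v + 2 * k2v + 2 * k3v + k4v)
        t += h
    return y

# Initial data at t = 1 chosen to pick out each mode separately.
growing = integrate(1.0, 10.0, 1.0, 1.0)     # delta = t   has delta(1) = 1, delta'(1) = +1
decaying = integrate(1.0, 10.0, 1.0, -1.0)   # delta = 1/t has delta(1) = 1, delta'(1) = -1
print(growing, decaying)
```

A tenfold increase in $t$ multiplies the growing mode by ten and divides the decaying mode by ten, in contrast with the exponential growth of a Jeans instability in a static medium.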


11 The Very Early Universe 

The thermal history of the universe was traced in Section 15.6 back to an era
when the temperature was about $10^{12}$ °K. At this early time, the universe was
filled with particles (photons, leptons, and antileptons) whose interactions are
hopefully weak enough to allow this medium to be treated as a more or less ideal
gas. However, if we look back a little further, into the first 0.0001 sec of cosmic
history when the temperature was above $10^{12}$ °K, we encounter theoretical problems
of a difficulty beyond the range of modern statistical mechanics. At such
temperatures, there will be present in thermal equilibrium copious numbers of
strongly interacting particles (mesons, baryons, and antibaryons) with a mean
interparticle distance less than a typical Compton wavelength. These particles
will be in a state of continual mutual interaction, and cannot reasonably be
expected to obey any simple equation of state.

However, the temptation to try to construct some sort of model of the very 
early universe is irresistible. There are in fact two extremely different simple 
models that have been widely considered in recent years, and that reflect two 
divergent views of the nature of the strongly interacting particles. Although 
neither model can be taken seriously in detail, the hope is that one or the other 
of these models may come close enough to reality to lead to useful insights about 
the very early universe. 

The first of these two pictures may be called the elementary particle model.
It is supposed that all particles are made up of a small number of elementary
particles (say, photons, leptons, “quarks,” and their antiparticles). It is further
supposed here that at very high temperatures the forces that bind the elementary
particles become negligible, just as the neutron-proton force becomes cosmologically
unimportant at temperatures above the dissociation temperature of
deuterium. Let there be $\mathcal{N}$ different kinds of elementary particles, counting spin
states and antiparticles separately, and counting fermions as 7/8 of a particle.
(See Eq. (15.6.32). For instance, if we include only the familiar photons, leptons,
and antileptons, plus three kinds of spin 1/2 quarks and antiquarks, we have
$\mathcal{N} = 26$.) Then for $kT$ above the mass of the heaviest elementary particle, the
contents of the universe will behave as if they consisted of $\mathcal{N}/2$ different kinds of
black-body radiation, with pressure, energy density, and specific entropy given by


$$ 3p \simeq \rho \simeq \tfrac{1}{2} \mathcal{N} a T^4 \tag{15.11.1} $$

$$ \sigma \equiv \frac{\rho + p}{n_B kT} = \frac{2 \mathcal{N} a T^3}{3 n_B k} \tag{15.11.2} $$

where $n_B$ is the number density of baryons minus antibaryons. (The extra factor
1/2 enters here to cancel the factor 2 in the Stefan-Boltzmann constant arising from
the two photon polarization states.) For an adiabatic expansion $\sigma$ is constant,
and since $\sigma$ at present has the value (15.5.18), the temperature in the very early
universe is given by


T 

To 



(15.11.3) 


The relation between the energy density $\rho$ and the time $t$ here is the same as
(15.6.44), and so

$$ t = \left( \frac{3}{16\pi G \mathcal{N} a T^4} \right)^{1/2} \tag{15.11.4} $$
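
For orientation, the time-temperature relation can be evaluated numerically. The sketch below assumes the form $t = [3/(16\pi G \mathcal{N} a T^4)]^{1/2}$, which follows from $\rho = \tfrac12 \mathcal{N} a T^4$ and $\rho = 3/(32\pi G t^2)$, with a factor $c^2$ restored to convert the energy density to a mass density in c.g.s. units. The constants are rounded values and the function name is made up for this illustration.

```python
import math

G = 6.674e-8        # Newton's constant, cm^3 g^-1 s^-2
C = 2.998e10        # speed of light, cm s^-1
A_RAD = 7.566e-15   # radiation constant a, erg cm^-3 K^-4

def age_at_temperature(temperature_k, n_species):
    """Cosmic time in seconds at temperature T for N effective species,
    t = [3 c^2 / (16 pi G N a T^4)]^(1/2)  (assumed c.g.s. form of the relation)."""
    return math.sqrt(3.0 * C**2 /
                     (16.0 * math.pi * G * n_species * A_RAD * temperature_k**4))

t_12 = age_at_temperature(1.0e12, 26)
print(f"t(T = 1e12 K, N = 26) = {t_12:.2e} s")
```

With $\mathcal{N} = 26$ the result at $10^{12}$ °K comes out somewhat below $10^{-4}$ sec, in line with the "first 0.0001 sec" quoted at the start of this section, and $t$ scales as $T^{-2}$.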


In contrast, in a composite particle model, it is supposed that there are no true
elementary strongly interacting particles but instead that all such “hadrons”
must be regarded as composites of one another. We then face a question of principle,
whether thermodynamic calculations can be carried out including as
“particles” only the one absolutely stable hadron, the proton, or also slowly
decaying hadrons, such as the neutron and pi-mesons, or perhaps all resonant
hadron states, including such rapidly decaying resonances as the rho-meson and
the “3-3” $\pi$-N resonance. It is an attractive conjecture that if all resonances were
included in our thermodynamic calculations, then to a first approximation, it
might not be necessary to take any further account of the particles’ mutual
interactions. (See Section 11.9.) If so, then the contents of the early universe
could be treated as consisting of a great many ideal gases, with $\mathcal{N}(m)\,dm$ types
between mass $m$ and $m + dm$. But what is the function $\mathcal{N}(m)$? The greatest
possible contrast with the elementary-particle model is achieved for a distribution
that grows as fast as possible, that is,

$$ \mathcal{N}(m) \to A\, m^{-B} \exp\left( \frac{m}{kT_M} \right) \quad \text{for } m \to \infty \tag{15.11.5} $$

where $A$, $B$, and $T_M$ are unknown constants. Thermodynamic quantities will
generally involve integrals of $\mathcal{N}(m)\,dm$, with weighting factors that behave like
$e^{-m/kT}$ for $m \to \infty$, so these quantities would not converge for a distribution
function that grew faster than (15.11.5), and, even for the distribution function
(15.11.5), would not converge for $T > T_M$. An ideal-gas model with a number of
species given by Eq. (15.11.5) is thus characterized by a maximum temperature $T_M$.
The analysis of secondary particle emission in very high-energy reactions 158 and
the recent Veneziano model of hadron interactions 159 both independently suggest
a number of hadron species given by Eq. (15.11.5), with $B$ of order 2 to 4, and with
$T_M$ of order $1.7 \times 10^{12}$ °K. Leaving aside mesons, leptons, and photons for the
moment, the total energy density, pressure, and baryon number density are given
by the usual Fermi distribution as


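$$ \rho = h^{-3} \int \mathcal{N}(m)\, dm \int \left\{ [e^{(E-\mu)/kT} + 1]^{-1} + [e^{(E+\mu)/kT} + 1]^{-1} \right\} E \, d^3p \tag{15.11.6} $$

$$ p = h^{-3} \int \mathcal{N}(m)\, dm \int \left\{ [e^{(E-\mu)/kT} + 1]^{-1} + [e^{(E+\mu)/kT} + 1]^{-1} \right\} \frac{\mathbf{p}^2}{3E} \, d^3p \tag{15.11.7} $$

$$ n = h^{-3} \int \mathcal{N}(m)\, dm \int \left\{ [e^{(E-\mu)/kT} + 1]^{-1} - [e^{(E+\mu)/kT} + 1]^{-1} \right\} d^3p \tag{15.11.8} $$

where $E$ is the particle energy $(\mathbf{p}^2 + m^2)^{1/2}$, and $\mu$ is the chemical potential
associated with baryon number. The dimensionless entropy per baryon $\sigma$ is
defined by the second law of thermodynamics as the integral of the perfect
differential

$$ d\sigma = \frac{1}{kT} \left[ d\left( \frac{\rho}{n} \right) + p \, d\left( \frac{1}{n} \right) \right] $$

and a straightforward integration gives

$$ \sigma = \frac{\rho + p - \mu n}{nkT} \tag{15.11.9} $$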


In an adiabatic expansion, $\rho$ and $p$ drop from presumably infinite initial values,
while $\sigma$ must stay constant. The only way that $\rho$ and $p$ can approach infinity while
$\sigma$ remains fixed is for $\mu$ to become infinite while $T$ approaches a finite value $T_1$
less than $T_M$. In this limit, the integrals (15.11.6)-(15.11.8) approach the values 160


P 



p -> A'e tl/kTM p 5/2 B kT x esc 





(15.11.10) 

(15.11.11) 


lin -» A'e“ lkT ^n 5l2 -- B kT l esc ( ^ 

\ T m 


X 



(B — f ) cot 


TtT i 

t m 



(15.11.12) 


where $A' \equiv (kT_M)^{3/2} h^{-3} (8\pi^5)^{1/2} A$. The entropy (15.11.9) then takes the value


a 


T m 
T i 


— n cot 


nT l 

T M 


(15.11.13) 


Since $\sigma$ is very large, the initial temperature $T_1$ is very close to the maximum
temperature $T_M$:



(15.11.14) 


With a finite initial temperature, the energy density and pressure of mesons,
leptons, and photons is negligible in the limit $t \to 0$ in comparison with the
baryonic contributions (15.11.10) and (15.11.11), justifying the neglect of all
particles but baryons in the above calculations. The baryon number density must
vary as $R^{-3}$, so Eqs. (15.11.12) and (15.11.10) give for $R \to 0$

$$ \mu \to 3kT_M |\ln R| \tag{15.11.15} $$

and

$$ \rho \to \mu n \propto R^{-3} |\ln R| \tag{15.11.16} $$

The Einstein field equation (15.1.20) then has a solution (with $k$ neglected) of the
form 160

$$ R \propto t^{2/3} |\ln t|^{1/2} \tag{15.11.17} $$

in contrast with the behavior $R \propto t^{1/2}$ expected in an elementary-particle model.

How can we distinguish among models of the very early universe? As pointed
out in Section 15.6, most of the constituents of the universe were in thermal
equilibrium at temperatures above $10^{12}$ °K, so that the present contents of the
universe for the most part depend only on the entropy per baryon, and perhaps
on the ratio of the lepton numbers to the baryon number, in the hot early universe.
In order to learn something about the behavior of the universe before the temperature
dropped to $10^{12}$ °K, we need to look for fossils, particles that might have
escaped thermal equilibrium before the temperature dropped below $10^{12}$ °K.

One such possible fossil particle is the quark, the hypothetical fundamental 
particle of the strong interactions. Zeldovich 161 has estimated, on the basis of 
what we have here called an elementary-particle model, that if quarks really can 
exist as free particles, the density of leftover quarks, which escaped fusion into 
nucleons in the early universe, would be about the same as the observed present 
density of gold atoms. Efforts to find quarks in nature have so far failed, so one 
can conclude either that free quarks do not exist, or that the thermal history of 
the very early universe was very different from (15.11.4). 

One other, less hypothetical, fossil particle is the graviton. As shown by Eq.
(15.10.42), a graviton in an imperfect fluid with shear viscosity $\eta$ will have a mean
free time 156

$$ \tau_g = (16\pi G\eta)^{-1} \tag{15.11.18} $$

If $\tau_g$ is not much longer than the expansion time $t$, then the transport of momentum
by gravitons will give the medium a viscosity

$$ \eta = \tfrac{4}{15} a T^4 \tau_g \tag{15.11.19} $$

Eliminating $\eta$ from these two equations then gives 149

$$ \tau_g = \left( \tfrac{64}{15} \pi G a T^4 \right)^{-1/2} \tag{15.11.20} $$

In an elementary-particle model, Eqs. (15.11.20) and (15.11.4) give the ratio of
the graviton mean free time to the expansion time as

$$ \frac{\tau_g}{t} = \left( \frac{5\mathcal{N}}{4} \right)^{1/2} \tag{15.11.21} $$

If $\mathcal{N}$ is not too large, $\tau_g$ will be not much greater than $t$, so that (15.11.19) and
(15.11.20) will be roughly correct, and $\tau_g$ will thus vanish for $t \to 0$ like $t$. In an
elementary-particle model, then, the “optical depth” $\int \tau_g^{-1}\, dt$ of the very early
universe for gravitational radiation diverges logarithmically for $t \to 0$, and the
present universe should thus contain left-over black-body gravitational radiation, 162
with a temperature


_ Who 
R q 



(15.11.22) 


For instance, if $\mathcal{N} = 26$ and $T_{\gamma 0} = 2.7$°K, then the present gravitational radiation
background temperature is about 0.9°K. In contrast, in a composite-particle
model, the product $RT$ vanishes for $t \to 0$, so even if the universe is “optically
thick” to gravitational radiation, the present graviton background temperature
would be very much less than the value (15.11.22). Thus the presence or the
absence of a gravitational radiation background with temperature of order 1°K
would provide clear evidence for the behavior of matter in the very early universe.
Unfortunately, there does not seem to be any way to detect a 1°K gravitational
radiation background directly. 162 Its most important effect would be to shorten
somewhat the expansion time scale during the radiation-dominated era, very
slightly increasing the cosmic production of helium.
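
As an illustration of how the graviton mean free time compares with the expansion time in an elementary-particle model, the sketch below evaluates both numerically. It assumes $\tau_g = (64\pi G a T^4/15)^{-1/2}$ and $t = [3/(16\pi G \mathcal{N} a T^4)]^{1/2}$, with $c^2$ restored for c.g.s. units; the constants are rounded and the function names are invented for this example.

```python
import math

G = 6.674e-8        # Newton's constant, cm^3 g^-1 s^-2
C = 2.998e10        # speed of light, cm s^-1
A_RAD = 7.566e-15   # radiation constant a, erg cm^-3 K^-4

def graviton_mean_free_time(temperature_k):
    """tau_g = [15 c^2 / (64 pi G a T^4)]^(1/2), the assumed c.g.s. form of (15.11.20)."""
    return math.sqrt(15.0 * C**2 / (64.0 * math.pi * G * A_RAD * temperature_k**4))

def expansion_time(temperature_k, n_species):
    """t = [3 c^2 / (16 pi G N a T^4)]^(1/2), the assumed c.g.s. form of (15.11.4)."""
    return math.sqrt(3.0 * C**2 /
                     (16.0 * math.pi * G * n_species * A_RAD * temperature_k**4))

def mean_free_time_ratio(temperature_k, n_species):
    """Ratio tau_g / t; it should not depend on the temperature."""
    return graviton_mean_free_time(temperature_k) / expansion_time(temperature_k, n_species)

for T in (1.0e12, 1.0e15):
    print(T, mean_free_time_ratio(T, 26))
```

The ratio is the same at every temperature and equals $(5\mathcal{N}/4)^{1/2}$, so for $\mathcal{N} = 26$ the graviton mean free time is several times the expansion time, which is why the estimate above is described as only roughly correct.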

It may be that the heat represented by the huge entropy per baryon in the 
microwave background provides the most useful clue to the very early history of 
the universe. Of course, it is possible that this heat was put in at the initial 
singularity, in which case we must regard $\sigma$ as a dimensionless fundamental
constant, like the fine structure constant. However, it is more attractive to 
suppose that the present entropy per baryon was generated by physical dissipative 
processes acting in the early, or the very early, universe. 

One such nonadiabatic mechanism for entropy generation is provided by the
phenomenon of bulk viscosity. We saw in Section 15.10 that the shear viscosity
and heat conduction can play no role in a Robertson-Walker model. The only
dissipative effect that can enter in the energy-momentum tensor for an isotropic
homogeneous expansion is the term in Eq. (2.11.21) proportional to the bulk
viscosity, $\zeta$, which in general coordinates takes the form

$$ \Delta T^{\mu\nu} = -\zeta (g^{\mu\nu} + U^\mu U^\nu)\, U^\lambda{}_{;\lambda} $$

where $U^\mu$ is the fluid velocity four-vector. In a Robertson-Walker model, we have
$U^\lambda{}_{;\lambda} = 3\dot R/R$, so the total energy-momentum tensor here is

$$ T^{\mu\nu} = \rho\, U^\mu U^\nu + \left( p - \frac{3\zeta \dot R}{R} \right) (g^{\mu\nu} + U^\mu U^\nu) $$

The whole effect of the bulk viscosity is thus to replace the pressure $p$ with

$$ p^* = p - \frac{3\zeta \dot R}{R} \tag{15.11.23} $$

The bulk viscosity therefore has no effect in the formula, Eq. (15.1.20), for $\dot R$ in
terms of $\rho$. However, it does appear in the energy-conservation equation, which
now, in place of (15.1.21), reads

$$ \frac{d}{dR} (\rho R^3) = -3p^* R^2 = -3pR^2 + 9\zeta \dot R R \tag{15.11.24} $$

Since $n \propto R^{-3}$ the specific entropy will in general increase at a rate

$$ \dot\sigma = \frac{1}{nkT} \left[ \dot\rho + \frac{3\dot R}{R} (\rho + p) \right] $$

and (15.11.24) gives this as 149

$$ \dot\sigma = \frac{9\zeta \dot R^2}{nkTR^2} \tag{15.11.25} $$


For instance, for a fluid consisting of material particles with a very short mean
free time plus photons with mean free time $\tau$, the bulk viscosity is 149


c 



and Eq. (15.11.25) gives an entropy production rate 



(15.11.26) 


(15.11.27) 


(For neutrinos, just multiply by a factor of 7/8.) The production of entropy here
can be understood as due to the fact that the frequency of a free photon will vary
as $1/R$ between collisions, whereas the temperature of the material medium will
not vary as $1/R$ unless $(\partial p/\partial\rho)_n$ takes the value 1/3, so that the expansion of the
universe is continually pulling radiation and matter out of thermal equilibrium
with each other. 163 However, in an elementary-particle model $(\partial p/\partial\rho)_n$ will be
close to 1/3 in the very early universe, whereas in a composite-particle model, the
photons (or neutrinos) have only a small share of the total entropy, so that the
last factor in (15.11.27) is small. Estimates of $\dot\sigma/\sigma$ do not indicate that bulk viscosity
can account for the high entropy of the present universe. 149

If the present entropy of the universe is not due to bulk viscosity, then 
perhaps it is produced by the effects of shear viscosity or heat conduction in an 
initially anisotropic or inhomogeneous expansion. Indeed, it may be just these 
dissipative processes that are responsible for smoothing out initial anisotropies, 
and hence producing the high degree of isotropy observed in the cosmic microwave 
radiation background. Misner 164 has shown that neutrino viscosity, acting before
the temperature dropped to $2 \times 10^{10}$ °K, would have reduced the present anisotropy
of the black-body radiation, produced in an initially homogeneous but
anisotropic expansion, to less than 0.03%. (However, see pp. 525-6.)

Another possible explanation for the observed high entropy per baryon is
that the mean baryon number density really vanishes, as in the theories of
Klein 165 and Alfvén. 166 When (and if) the temperature was above $10^{13}$ °K, the
number density of nucleons plus antinucleons would have been of order

$$ n_N + n_{\bar N} \approx \frac{aT^3}{k} \approx \sigma n_B \approx \sigma (n_N - n_{\bar N}) $$

so if $\sigma$ has remained constant, the fractional excess of nucleons over antinucleons
in the very early universe would have had the very small value

$$ \frac{n_N - n_{\bar N}}{n_N + n_{\bar N}} \approx \frac{1}{\sigma} \approx 10^{-8} \text{ to } 10^{-9} $$
In the symmetric cosmology of Klein and Alfvén this small nucleon excess is
interpreted as a purely local phenomenon: it is supposed that in other parts of
the universe there was a small antinucleon excess, giving rise at present to galaxies
of antimatter. The detailed calculations of Omnès 166a show that reasonable
physical processes in a symmetric plasma of matter and antimatter could have
produced the required small separation of matter and antimatter. Unfortunately,
observational astronomy has not yet provided any definite information as to
whether distant galaxies consist of matter or antimatter. Only the gamma-ray
spectrum provides a hint 166b that antimatter may exist on a cosmological scale.

As long as we have the temerity to speculate about the very early universe,
we may as well carry our speculations back to the very beginning. The Friedmann
solutions, taken literally, indicate that $R$ vanishes as $t \to 0$, as $t^{1/2}$ in the
elementary-particle models and as $t^{2/3} |\ln t|^{1/2}$ in a composite-particle model. For all we
know, this singularity actually occurs, but it is natural to wonder whether it can
be avoided.

One way to avoid a singularity in the very early universe is for the energy
density $\rho$ to vanish, owing perhaps to some very short-range attractive force that
overbalances the particles’ rest masses. If $\rho$ vanishes at some critical value $R_c$ of
the scale factor $R(t)$, then $\dot R$ vanishes at $R_c$ (or, for finite curvature, near $R_c$) so
that $R$ might have decreased to $R_c$ before beginning its present increasing phase.

Even if the energy density is always positive, it is still conceivable that the
universe could escape a general singularity, through anisotropies or inhomogeneities
that invalidate the simple Friedmann solutions. Penrose 167 and Hawking 168
have proved a number of powerful theorems that show that a singularity
is inescapable under very general conditions. For instance, one of Hawking’s
theorems states that a singularity is unavoidable, provided that general relativity
is valid, that each point of space-time has a small neighborhood that no curve
with timelike or null tangents cuts more than once, that the energy-momentum
tensor satisfies the positivity condition

$$ \left[ T_{\mu\nu} - \tfrac{1}{2} g_{\mu\nu} T^\lambda{}_\lambda \right] W^\mu W^\nu \ge 0 $$

for all vectors $W^\mu$ with $W_\mu W^\mu < 0$, and that there is a point $p$ such that all the
past-directed timelike geodesics through $p$ start converging again within a compact
region in the past of $p$. The last condition is satisfied if there is enough matter to
make the world lines through $p$ converge in the past, and Hawking and Ellis 169
have shown that the cosmic microwave background does provide enough energy
density in the past to satisfy this condition. However, it is important to note that
the Penrose-Hawking theorems do not say that there is a singularity in the past
that involves all space, as in the Friedmann solutions, but only that there is some
singularity somewhere. The singularity might consist merely of one or more isolated
points, which behave like time-reversed collapsing stars.

Finally, it may be that classical general relativity itself breaks down in the
very early universe. One simple way for this to happen is through the effects of a
cosmological constant, to be discussed in the next chapter. A more intriguing
idea is that quantum effects might become important, invalidating any purely
classical field theory of gravitation. For a system consisting of point particles
with a mean particle energy $E$, the relative importance of gravitational “radiative
corrections” will be described by a “gravitational fine structure constant”

$$ \alpha_g \equiv \frac{GE^2}{\hbar} $$

analogous to the usual electromagnetic fine structure constant $\alpha \simeq 1/137$. Quantum
effects will become important when $\alpha_g$ is of order unity, or when $E$ reaches a
critical value, given in c.g.s. units by

$$ E_c = \left( \frac{\hbar c^5}{G} \right)^{1/2} = 1.22 \times 10^{28} \text{ eV} \tag{15.11.28} $$

corresponding to a temperature $1.4 \times 10^{32}$ °K. The temperature in a composite
particle model never gets anywhere near so high, but $kT$ would have been greater
than $E_c$ at the very beginning of a Friedmann universe composed of a finite
number $\mathcal{N}$ of species of elementary particles. In fact, a great many other quantum
effects become important at that time. 170 For instance, the oscillation rate of a
typical particle wave function at temperature $T$ is $kT/\hbar$, while the expansion rate
of the universe at this time is given by Eq. (15.11.4) as

$$ \frac{\dot R}{R} = \frac{1}{2t} = \left( \frac{4\pi G \mathcal{N} a T^4}{3} \right)^{1/2} $$

Recalling that $a = \pi^2 k^4 / 15\hbar^3$, the ratio of these rates is

$$ \frac{kT/\hbar}{\dot R/R} = \left( \frac{45\hbar}{4\pi^3 \mathcal{N} G (kT)^2} \right)^{1/2} = \left( \frac{45}{4\pi^3 \mathcal{N}} \right)^{1/2} \left( \frac{E_c}{kT} \right) $$

Hence for temperatures above the critical temperature $E_c/k$, the wave functions
of typical particles have an oscillation rate slower than the expansion rate of the
universe, so that no classical or semiclassical description can be applied to the
particles in thermal equilibrium at that time.
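
The numbers quoted for the critical energy and temperature are easy to reproduce. The sketch below evaluates $E_c = (\hbar c^5/G)^{1/2}$ in c.g.s. units and the ratio of oscillation rate to expansion rate, $(45/4\pi^3\mathcal{N})^{1/2}(E_c/kT)$; the constants are rounded values, and the function names and the choice $\mathcal{N} = 26$ in the example are illustrative assumptions.

```python
import math

HBAR = 1.0546e-27       # reduced Planck constant, erg s
C = 2.9979e10           # speed of light, cm s^-1
G = 6.674e-8            # Newton's constant, cm^3 g^-1 s^-2
K_B = 1.3807e-16        # Boltzmann constant, erg K^-1
ERG_PER_EV = 1.6022e-12

def critical_energy_erg():
    """E_c = (hbar c^5 / G)^(1/2), Eq. (15.11.28), in erg."""
    return math.sqrt(HBAR * C**5 / G)

def rate_ratio(kt_over_ec, n_species):
    """(kT/hbar) / (Rdot/R) = (45 / (4 pi^3 N))^(1/2) * (E_c / kT)."""
    return math.sqrt(45.0 / (4.0 * math.pi**3 * n_species)) / kt_over_ec

e_c_ev = critical_energy_erg() / ERG_PER_EV
t_c_kelvin = critical_energy_erg() / K_B
print(f"E_c = {e_c_ev:.3g} eV, E_c/k = {t_c_kelvin:.3g} K")
print("rate ratio at kT = E_c, N = 26:", rate_ratio(1.0, 26))
```

At $kT = E_c$ the ratio is already well below unity for any $\mathcal{N}$ of order ten or more, so the oscillation rate falls behind the expansion rate somewhat before the critical temperature is reached.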

The consideration of first things naturally leads to speculation about last
things. 171 We saw in Section 15.1 that a Friedmann universe with $k = +1$ will
eventually cease its expansion and begin to contract. Taken literally, such models
require that a singularity with $R = 0$ will be reached at some finite time in the
future, of order $75 \times 10^9$ years for $H_0 = 75$ km sec$^{-1}$ Mpc$^{-1}$ and $q_0 = 1$.
However, if it is possible to escape a general singularity in the past, either through
negative energy densities, anisotropies, or quantum effects, then presumably it
ought to be possible to escape the general singularity in our future. In this case, we
might suppose that the universe undergoes an oscillation, with periods of contraction
and expansion succeeding one another eternally.
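
The figure of about $75 \times 10^9$ years can be checked against the standard cycloid solution for a closed, pressureless Friedmann model. That solution is not derived in this section, so the formulas in the sketch are an assumption imported from the usual matter-dominated treatment: $H_0 t_0 = q_0 (2q_0 - 1)^{-3/2} \cos^{-1}(1/q_0 - 1) - (2q_0 - 1)^{-1}$ for the present age, and $2\pi q_0 (2q_0 - 1)^{-3/2}/H_0$ for the full lifetime. Function names and unit conversions are illustrative.

```python
import math

KM_PER_MPC = 3.0857e19       # kilometres in one megaparsec
SECONDS_PER_YEAR = 3.156e7

def hubble_time_years(h0_km_s_mpc):
    """1/H0 in years for H0 given in km/sec/Mpc."""
    return KM_PER_MPC / h0_km_s_mpc / SECONDS_PER_YEAR

def closed_model_times(h0_km_s_mpc, q0):
    """Return (present age, total lifetime) in years for a closed dust model."""
    if q0 <= 0.5:
        raise ValueError("a closed (k = +1) matter-dominated model needs q0 > 1/2")
    t_hubble = hubble_time_years(h0_km_s_mpc)
    x = 2.0 * q0 - 1.0
    prefactor = q0 / x**1.5
    age = t_hubble * (prefactor * math.acos(1.0 / q0 - 1.0) - 1.0 / x)
    lifetime = t_hubble * 2.0 * math.pi * prefactor
    return age, lifetime

age, lifetime = closed_model_times(75.0, 1.0)
time_left = lifetime - age
print(f"age now: {age:.3g} yr, time until R = 0: {time_left:.3g} yr")
```

Under these assumptions the remaining time comes out close to the value quoted in the text, with the present age an order of magnitude smaller.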
Could this oscillation be periodic? That is, can we restore a steady state
picture of the universe, by viewing cosmic history on a sufficiently grand time
scale? One obvious objection is that entropy is presumably created, not destroyed,
in each cycle. It has been suggested that entropy might be destroyed during
contracting phases, 172 because it is the expansion of the universe that, by providing
a heat sink, sets the direction of time’s arrow in thermodynamic processes.
However, there is no detailed model that describes how this can come about. In
particular, it is hard to see how time’s arrow could be reversed just at the moment
when $R(t)$ reaches its maximum value, at which time the background radiation
temperature is so low, of order 1°K, that it can hardly affect terrestrial processes.
If somehow or other the second law of thermodynamics can be evaded, and the
universe really does expand and contract periodically, then any particles that are
not brought into thermal equilibrium during the contracted phase, such (perhaps)
as gravitons or neutrinos, would have to be present in large numbers: If $N$ particles
are produced in a given comoving volume during each cycle, and the probability
that one of these particles is absorbed in one cycle is $P$, then there would have to
be a mean number $N/P$ of particles in this volume in order to maintain a more or
less constant population. It is therefore not out of the question that some day we
may detect remnants of previous cycles of the history of the universe. For the
present, however, such matters remain at the furthest bounds of cosmological
speculation.
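
The steady population $N/P$ follows from a simple balance between production and absorption, which the sketch below iterates cycle by cycle. The ordering within a cycle (absorption of a fraction $P$ of the existing particles, then production of $N$ new ones) and the sample numbers are assumptions made only for illustration.

```python
def steady_population(n_produced, p_absorbed, cycles=2000, start=0.0):
    """Iterate x -> (1 - P) x + N for a number of cycles and return the result.

    The fixed point of this map is x = N / P, whatever the starting value.
    """
    population = start
    for _ in range(cycles):
        population = (1.0 - p_absorbed) * population + n_produced
    return population

final = steady_population(n_produced=5.0, p_absorbed=0.01)
print(final, "expected fixed point:", 5.0 / 0.01)
```

The smaller the absorption probability per cycle, the larger the relic population, and the more cycles are needed to approach it.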
“And as for certain truth,
no man has seen it, nor will 
there ever be a man who 
knows about the gods and 
about all the things I 
mention. For if he succeeds 
to the full in saying what is 
completely true, he himself 
is unaware of it ; and 
Opinion is fixed by fate upon 
all things.” Xenophanes of 
Colophon 

16 COSMOLOGY: 
OTHER MODELS 


The big-bang Friedmann model discussed in the last chapter does not at any
point come into clear conflict with observation. However, it also cannot yet be
said to have been definitely confirmed by observation. Therefore in this chapter we
shall take a brief look at some of the other cosmological models that still compete
with the “standard” theory.


1 Naive Models: The Olbers Paradox

Throughout the eighteenth and nineteenth centuries, perhaps a majority of 
astronomers would have subscribed to a simple cosmological picture, in which the 
universe is supposed to be infinite, eternal, and Euclidean, and the stars are more 
or less at rest, with constant average luminosity per unit volume. Such naive 
models would seem to be ruled out by the discovery of the general red shift of 
distant galaxies, but it is still of some interest to note an argument against the naive 
cosmologies, which was offered in 1744 by the Swiss astronomer J. P. L. de 
Cheseaux, 1 and, independently in 1826, by Heinrich Wilhelm Matthias Olbers 2 
(1758-1840). Their argument was based on the most ancient of all astronomical 
observations, that the sky grows dark when the sun goes down. 

To see the significance of this observation, note that if absorption is neglected,
the apparent luminosity of a star of absolute luminosity $L$ at a distance $r$ in a
naive cosmological model will be $L/4\pi r^2$. If the number density of such stars is a
constant $n$, then the number of stars at distances between $r$ and $r + dr$ is $4\pi n r^2\, dr$,
so the total radiant energy density due to all stars is

$$ \rho_s = \int_0^\infty \left( \frac{L}{4\pi r^2} \right) 4\pi n r^2 \, dr = Ln \int_0^\infty dr \tag{16.1.1} $$

The integral diverges, leading to an infinite energy density of starlight!
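
The divergence is linear in the cutoff distance, as a direct shell-by-shell sum shows. The sketch below adds up the contributions $(L/4\pi r^2)(4\pi n r^2\, dr)$ out to a finite radius; the function name, the midpoint rule, and the sample values of $L$ and $n$ are illustrative choices in arbitrary units.

```python
import math

def starlight_within(r_max, luminosity, number_density, shells=1000):
    """Sum the starlight from all shells out to r_max in the naive static model."""
    dr = r_max / shells
    total = 0.0
    for i in range(shells):
        r = (i + 0.5) * dr                                   # midpoint of the shell
        per_star = luminosity / (4.0 * math.pi * r**2)       # apparent luminosity
        stars = 4.0 * math.pi * number_density * r**2 * dr   # stars in the shell
        total += per_star * stars
    return total

for r_max in (10.0, 100.0, 1000.0):
    print(r_max, starlight_within(r_max, luminosity=2.0, number_density=3.0))
```

Each shell contributes the same amount $Ln\,dr$ regardless of its distance, so the total grows without limit as the cutoff is pushed outward.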

In order to avoid this paradox, both de Cheseaux and Olbers postulated the
existence of an interstellar medium that absorbs the light from the very distant 
stars responsible for the divergence of the integral (16.1.1). However, this resolution 
of the paradox is unsatisfactory, 3 because in an eternal universe the temperature 
of the interstellar medium would have to rise until the medium was in thermal 
equilibrium with the starlight, in which case it would be emitting as much energy 
as it absorbs, and hence could not reduce the average radiant energy density. 
The stars themselves are of course opaque, and totally block out the light from 
sufficiently distant sources, but if this is the resolution of the Olbers paradox, then 
every line of sight must terminate at the surface of a star, so the whole sky should 
have a temperature equal to that at the surface of a typical star. 

To see how modern cosmological models avoid the Olbers paradox, we note 
that according to Eq. (14.4.12), the apparent luminosity of a star of absolute 
luminosity I at a comoving coordinate r 1 is (now neglecting absorption) 

l = - m 2 (y 

MzR 4 (t 0 ) r i 2 

where t 0 is the time the star is observed and t t is the time the light was emitted. 
Also, according to Eq. (14.7.4), the number of stars of luminosity between L and 
L + dL , whose light, observed at time t 0 , was emitted between times t t — dt t 
and dt x , is 

dN = 47r-ft 2 (£ 1 )r 1 2 %(£ 1 , L) dt 1 dL 


where n(t 1 , L) dL is the number density of stars at time t x with luminosity between 
L and L -f dL . The total energy density of starlight is therefore 


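$$\rho_{s0} = \int l\, dN = \int_{-\infty}^{t_0} \mathcal{L}(t_1) \left[\frac{R(t_1)}{R(t_0)}\right]^4 dt_1$$    (16.1.2)

where $\mathcal{L}(t_1)$ is the proper luminosity density

$$\mathcal{L}(t_1) \equiv \int n(t_1, L)\, L\, dL$$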

In a “big-bang” cosmology, there is obviously no paradox, since the integral 
(16.1.2) is effectively cut off at a lower limit $t_1 = 0$, and the integrand vanishes at
$t_1 = 0$, roughly like $R(t_1)$. The question of an Olbers paradox arises only in models,
such as the steady state cosmology, in which the universe is supposed to have
existed for an infinitely long time. In such models, a necessary condition for
avoidance of the Olbers paradox is that

$$\mathcal{L}(t_1)\, R^4(t_1) \to 0 \quad \text{for} \quad t_1 \to -\infty$$    (16.1.3)

For neutrinos there is a slightly stronger condition, 4 with $R^3(t_1)$ in place of $R^4(t_1)$,
because one of the factors of $R(t_1)/R(t_0)$ in Eq. (16.1.2) arose from the loss of
energy by individual red-shifted photons, and for neutrinos the number density as 
well as the energy density is in principle observable. The only popular cosmology 
in which (16.1.3) is not satisfied is the oscillating model discussed in Section 15.11. 
In this case, absorption is needed to avoid an Olbers paradox, but the absorption 
occurs during the highly contracted era, and the red shift during the subsequent 
expansion saves us from an intolerably bright night sky. From this point of view, 
the 2.7°K microwave background appears as the pale image of the fiery furnace 
with which we were threatened by de Cheseaux and Olbers. 
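
To make the big-bang cutoff in Eq. (16.1.2) concrete, the following sketch evaluates the integral numerically under two illustrative assumptions that are not taken from the text: a matter-dominated scale factor $R \propto t^{2/3}$, and a luminosity density that simply dilutes as $R^{-3}$. The function name and the midpoint-rule quadrature are likewise assumptions made for the example.

```python
def starlight_density_big_bang(lum_density_0, t0, steps=20000):
    """Midpoint-rule estimate of Eq. (16.1.2) with lower limit t1 = 0.

    Assumes R(t) ~ t**(2/3) and luminosity density ~ R**-3, so the
    integrand lum_density(t1) * (R(t1)/R(t0))**4 reduces to
    lum_density_0 * (t1/t0)**(2/3), which vanishes at t1 = 0.
    """
    dt = t0 / steps
    total = 0.0
    for i in range(steps):
        t1 = (i + 0.5) * dt
        total += lum_density_0 * (t1 / t0) ** (2.0 / 3.0) * dt
    return total


# Under these assumptions the exact answer is (3/5) * lum_density_0 * t0.
print(starlight_density_big_bang(1.0, 1.0))
```

Under these assumptions the energy density is finite and of order the present luminosity density times the age of the universe, in contrast with the divergent result (16.1.1).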


2 Models with a Cosmological Constant 

When Einstein formulated the general theory of relativity in 1916, the universe 
was generally believed to be static. According to Eqs. (15.1.18) and (15.1.19), the 
scale factor R(t) can only be constant if 

$$\rho = -3p = \frac{3k}{8\pi G R^2}$$

However, this requires that either the energy density $\rho$ or the pressure p should be
negative. In order to avoid this unphysical result, Einstein in 1917 modified his
equations to read 5

$$R_{\mu\nu} - \tfrac{1}{2} g_{\mu\nu} R^{\lambda}{}_{\lambda} - \lambda g_{\mu\nu} = -8\pi G\, T_{\mu\nu}$$    (16.2.1)

where $\lambda$ is a new fundamental constant, the so-called cosmological constant.

We have already noted at the end of Section 7.1 that Eq. (16.2.1) is the most 
general modification of Einstein’s equations that preserves the feature that $T_{\mu\nu}$
is set equal to a tensor that is constructed from $g_{\mu\nu}$ and its first and second derivatives,
and is linear in the second derivatives of $g_{\mu\nu}$. However, for our present
purposes it is more convenient to move $\lambda g_{\mu\nu}$ to the right-hand side of the equations,
writing

$$R_{\mu\nu} - \tfrac{1}{2} g_{\mu\nu} R^{\lambda}{}_{\lambda} = -8\pi G\, \tilde{T}_{\mu\nu}$$    (16.2.2)

where $\tilde{T}_{\mu\nu}$ is a modified energy-momentum tensor:

$$\tilde{T}_{\mu\nu} \equiv T_{\mu\nu} - \frac{\lambda}{8\pi G}\, g_{\mu\nu}$$    (16.2.3)

If $T_{\mu\nu}$ has the perfect-fluid form (15.1.12), then so does $\tilde{T}_{\mu\nu}$:

$$\tilde{T}_{\mu\nu} = \tilde{p}\, g_{\mu\nu} + (\tilde{p} + \tilde{\rho})\, U_\mu U_\nu$$    (16.2.4)

with a modified density and pressure

$$\tilde{p} = p - \frac{\lambda}{8\pi G} \qquad \tilde{\rho} = \rho + \frac{\lambda}{8\pi G}$$    (16.2.5)

All of the results obtained in Section 15.1 still apply to theories with a cosmological
constant, provided that we replace the quantities $\rho$ and p with the modified
density and pressure (16.2.5).

In particular, the conditions for a static universe now read 


$$\tilde{\rho} = -3\tilde{p} = \frac{3k}{8\pi G R^2}$$    (16.2.6)

For a “dust”-filled universe with p = 0, this gives

$$\frac{k}{R^2} = \lambda$$    (16.2.7)

$$\rho = \frac{\lambda}{4\pi G}$$    (16.2.8)

In order to have $\rho$ positive, Eq. (16.2.8) requires that $\lambda$ should be positive, and Eq.
(16.2.7) then tells us that

$$k = +1$$    (16.2.9)

and

$$R = \frac{1}{\sqrt{\lambda}}$$    (16.2.10)

The static Einstein universe is therefore finite (though of course unbounded), with
a positive curvature and a density that are fixed by the fundamental constants $\lambda$
and G.
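
The static solution can be checked with a few lines of arithmetic. The sketch below computes the radius and dust density from Eqs. (16.2.8) and (16.2.10) and verifies the static conditions (16.2.6) using the modified density and pressure (16.2.5). The function names and the unit choice in the example are assumptions made for illustration.

```python
import math


def einstein_static(lam, G):
    """Radius and dust density of the static Einstein universe.

    Uses Eqs. (16.2.8) and (16.2.10): rho = lam/(4 pi G), R = 1/sqrt(lam).
    """
    if lam <= 0:
        raise ValueError("a static dust universe needs lam > 0")
    rho = lam / (4.0 * math.pi * G)
    R = 1.0 / math.sqrt(lam)
    return R, rho


def static_conditions_hold(lam, G, k=1):
    """Check Eq. (16.2.6) with the modified quantities (16.2.5) for p = 0."""
    R, rho = einstein_static(lam, G)
    rho_mod = rho + lam / (8.0 * math.pi * G)
    p_mod = -lam / (8.0 * math.pi * G)
    target = 3.0 * k / (8.0 * math.pi * G * R**2)
    return math.isclose(rho_mod, target) and math.isclose(-3.0 * p_mod, target)


print(einstein_static(2.0, 1.0), static_conditions_hold(2.0, 1.0))
```

Passing `k=0` makes the check fail, which mirrors the statement that the static dust solution requires positive curvature.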

Of course, the discovery during the 1920’s of a systematic relation between 
red shift and distance removed any interest in the static Einstein universe as a 
realistic cosmological model. Nevertheless, the existence of a cosmological constant 
remains a logical possibility, and cosmologists have thoroughly explored the 
dynamics of expanding universes with a cosmological constant. 6 We shall restrict 
our attention here to models with zero pressure, so that Eq. (15.1.21) gives $\rho R^3$
constant. It is convenient to express this constant in terms of the value it would
have in a static Einstein model:

$$\rho R^3 = \frac{\alpha}{4\pi G \sqrt{|\lambda|}}$$    (16.2.11)

The dynamical equation (15.1.20), with $\rho$ replaced by the modified density $\tilde{\rho}$
defined in Eq. (16.2.5), now reads

$$\dot{R}^2 = \frac{1}{R}\left\{\frac{\lambda R^3}{3} - kR + \frac{2\alpha}{3\sqrt{|\lambda|}}\right\}$$    (16.2.12)
The qualitative behavior of R(t) depends on the pattern of zeros, maxima, and
minima of the cubic on the right-hand side. There are three special cases of particular
interest, associated with the names of de Sitter, Lemaitre, and Eddington
and Lemaitre.

In the de Sitter model, 7 space is essentially empty and flat, so that $k = \alpha = 0$,
and $\lambda$ is positive. Equation (16.2.12) then has the simple solution

$$R \propto e^{Ht}$$    (16.2.13)

$$H = \left(\frac{\lambda}{3}\right)^{1/2}$$    (16.2.14)

The metric here is the same as in the steady state model discussed in Section 14.8, 
with the difference that instead of matter being continuously created, there is no 
matter at all! As discussed in Section 13.3, this metric has a ten-parameter group 
of isometries, which is just the group of “rotations” in five dimensions that leave 
invariant a diagonal matrix with elements +1, +1, +1, +1, —1. This group is 
therefore often called the de Sitter group. Although the absence of matter in the 
de Sitter model removes it from consideration as a serious model of the universe, 
it should be noted that any model with $\lambda > 0$ goes over to a de Sitter model for
$R \to \infty$.

In what is known as the Lemaitre model, 8 space is positively curved, $\lambda$ is
positive, and more matter is present than in a static Einstein model, so that
$k = +1$ and $\alpha > 1$. According to Eq. (16.2.12), the scale factor R starts expanding
at t = 0 like $t^{2/3}$, but the expansion then slows down, reaching a minimum
rate at $R = \alpha^{1/3}/\sqrt{\lambda}$, after which it speeds up again, ultimately approaching the
de Sitter result (16.2.13). The most remarkable feature of this model is the
existence of a “coasting period” during which R(t) remains close to the value
$R = \alpha^{1/3}/\sqrt{\lambda}$ at which $\dot{R}$ has its minimum. During this period, the differential
equation (16.2.12) with $k = +1$ takes the approximate form

$$\dot{R}^2 \simeq \alpha^{2/3} - 1 + \left(\sqrt{\lambda}\, R - \alpha^{1/3}\right)^2$$

The solution is

$$R = \frac{\alpha^{1/3}}{\sqrt{\lambda}} \left[1 + (1 - \alpha^{-2/3})^{1/2} \sinh\left(\sqrt{\lambda}\,(t - t_m)\right)\right]$$

where $t_m$ is the time at which $\dot{R}$ reaches its minimum. If $\alpha$ is very close to unity,
then R will remain close to the static Einstein value for a long time, of order

$$\Delta t \simeq \lambda^{-1/2}\, \left|\ln(1 - \alpha^{-2/3})\right|$$    (16.2.15)
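
The coasting behavior can be explored numerically. The sketch below codes the right-hand side of Eq. (16.2.12), the coasting radius, and the estimate (16.2.15). The function names and the sample values of $\lambda$ and $\alpha$ are assumptions chosen for illustration.

```python
import math


def rdot_squared(R, lam, k, alpha):
    """Right-hand side of Eq. (16.2.12): (dR/dt)**2 as a function of R."""
    return (lam * R**3 / 3.0 - k * R
            + 2.0 * alpha / (3.0 * math.sqrt(abs(lam)))) / R


def coasting_radius(lam, alpha):
    """Radius alpha**(1/3)/sqrt(lam) where dR/dt is smallest (k = +1, lam > 0)."""
    return alpha ** (1.0 / 3.0) / math.sqrt(lam)


def coasting_time(lam, alpha):
    """Order-of-magnitude coasting duration from Eq. (16.2.15), alpha > 1."""
    return abs(math.log(1.0 - alpha ** (-2.0 / 3.0))) / math.sqrt(lam)


lam, alpha = 1.0, 1.5  # assumed sample values
Rm = coasting_radius(lam, alpha)
# At the coasting radius the expansion rate squared is alpha**(2/3) - 1.
print(rdot_squared(Rm, lam, 1, alpha), alpha ** (2.0 / 3.0) - 1.0)
# The coasting period lengthens as alpha approaches unity.
print(coasting_time(lam, 1.5), coasting_time(lam, 1.001))
```

Evaluating `rdot_squared` slightly above and below `Rm` gives larger values, consistent with `Rm` being the minimum of the expansion rate.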

The Eddington-Lemaitre model is a limiting case of the Lemaitre models, 
given particular prominence through the work of Eddington. 9 It has the same 
curvature and mass as the static Einstein model; that is, $k = +1$ and $\alpha = 1$,
and behaves like a Lemaitre model with an infinitely long “coasting period.”
Thus, if we start with R = 0 at t = 0, then R asymptotically approaches the
Einstein value $1/\sqrt{\lambda}$ for $t \to \infty$. On the other hand, if we start with $R = 1/\sqrt{\lambda}$
for t = 0, then R grows monotonically, ultimately approaching the de Sitter 
exponential growth (16.2.13). This incidentally shows that the Einstein model is 
unstable, because if it is subjected to an infinitesimal expansion or contraction, 
then R must go on expanding or contracting, with a time dependence given by the 
Eddington-Lemaitre model. 

The observed concentration of the red shifts of quasi-stellar objects around 
$z \simeq 2$ (see Sections 11.6, 14.8) has revived interest in the Lemaitre models, 10
since it suggests that an unusually large number of QSO’s were present at a
particular value of the scale factor, $R \simeq R_0/3$, as would be expected in a Lemaitre
model for which the “coasting” radius $\alpha^{1/3}/\sqrt{\lambda}$ had this particular value. By
taking $\alpha$ close to unity, we can make the “coasting period” as long as we want, so
that the predominance of the particular red shift $z \simeq 2$ can be made as pronounced
as may be needed to account for the QSO observations. With this new motivation, 
there have recently been carried out studies of the propagation of light signals 
around the universe, 11 of the radio number counts, 12 and of the formation of 
galaxies, 13 in Lemaitre models. There is no definite evidence against the 
Lemaitre models, but they seem an artificial way to account for what may be 
merely a detailed feature of the evolution of the quasi-stellar objects. 


3 The Steady State Model Revisited 

If the universe is not only isotropic and homogeneous in space, but also 
homogeneous in time, then, as shown in Section 14.8, its metric must have the 
Robertson-Walker form, with

$$k = 0 \qquad R(t) \propto e^{Ht}$$    (16.3.1)

where H is the Hubble constant, here truly a constant of nature. Also, all scalars,
such as $\rho$ and p, must be time- as well as position-independent:

$$\dot{\rho} = \dot{p} = 0$$    (16.3.2)

The field equations underlying the steady state model were left unspecified in 
Section 14.8, but it is clear that Einstein’s theory would need to be modified to 
be used here. The Einstein field equations are only consistent with the Bianchi 
identities if the energy-momentum tensor is conserved, but a constant pressure 
would violate the energy-conservation equation (14.2.19) unless $p = -\rho$, which
would require either the energy density or the pressure to be negative.

It is therefore necessary to modify the Einstein equations by adding a correction
term 14 $C_{\mu\nu}$:

$$R_{\mu\nu} - \tfrac{1}{2} g_{\mu\nu} R^{\lambda}{}_{\lambda} + C_{\mu\nu} = -8\pi G\, T_{\mu\nu}$$    (16.3.3)

A straightforward calculation, using Eq. (16.3.1) in (15.1.6), (15.1.7), and (15.1.11),
gives

$$R_{\mu\nu} - \tfrac{1}{2} g_{\mu\nu} R^{\lambda}{}_{\lambda} = 3H^2 g_{\mu\nu}$$

so the form of the correction term required by the steady state model is

$$C_{\mu\nu} = -(8\pi G p + 3H^2)\, g_{\mu\nu} - 8\pi G (p + \rho)\, U_\mu U_\nu$$    (16.3.4)

where $U^\mu$ is the velocity four-vector, with $U^t = 1$ and $U^i = 0$.

In order to learn anything from Eq. (16.3.4), we need to impose some a priori 
ideas as to the form of the tensor $C_{\mu\nu}$. Hoyle 14 suggests that in general

$$C_{\mu\nu} = C_{;\mu;\nu}$$    (16.3.5)

where C is a scalar, called the C-field. Hoyle further suggests that in the absence of
all inhomogeneities or anisotropies, C is simply proportional to the cosmic time
coordinate used in the Robertson-Walker coordinate system:

$$C = At \qquad A\ \text{constant}$$    (16.3.6)

It is easy to calculate the second covariant derivative:

$$C_{;\mu;\nu} = -AH\,(g_{\mu\nu} + U_\mu U_\nu)$$    (16.3.7)

Comparison of (16.3.7) with (16.3.4) shows that the density must take the value

$$\rho = \frac{3H^2}{8\pi G}$$    (16.3.8)

and the constant of proportionality in C is

$$A = \frac{8\pi G(\rho + p)}{H}$$    (16.3.9)


The pressure can take any value. 
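
The comparison of (16.3.7) with (16.3.4) amounts to matching two coefficients, which the sketch below checks numerically. The function names and the unit choice G = H = 1 in the example are illustrative assumptions.

```python
import math


def steady_state_c_field(H, G, p):
    """Density and C-field slope from Eqs. (16.3.8) and (16.3.9)."""
    rho = 3.0 * H**2 / (8.0 * math.pi * G)
    A = 8.0 * math.pi * G * (rho + p) / H
    return rho, A


def coefficients_match(H, G, p):
    """Compare the g and U U coefficients of (16.3.7) with those of (16.3.4)."""
    rho, A = steady_state_c_field(H, G, p)
    g_required = -(8.0 * math.pi * G * p + 3.0 * H**2)
    uu_required = -8.0 * math.pi * G * (p + rho)
    # Eq. (16.3.7) gives the same coefficient, -A*H, for both structures.
    g_from_c = uu_from_c = -A * H
    return (math.isclose(g_required, g_from_c)
            and math.isclose(uu_required, uu_from_c))


for p in (0.0, 0.3, 2.0):  # assumed sample pressures
    print(p, coefficients_match(1.0, 1.0, p))
```

The match holds for each sample pressure, in line with the remark that the pressure is left undetermined.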

The predicted density (16.3.8) is the same as given by a Friedmann model 
with vanishing curvature. [See Eq. (15.2.1).] Thus the verification of Eq. (16.3.8) 
would not really serve to confirm the steady state model. Also, the steady state 
cosmology does not require that the C tensor take the form (16.3.5), (16.3.6), 
so we would not be forced to give up the steady state metric if the density were 
found to be different from (16.3.8). 

The clearest evidence against the steady state model is the observed cosmic 
microwave background, which seems to be a remnant of a quite different earlier 
stage of the universe. (See Section 15.5.) However, it is not out of the question for 
a microwave background to be created along with the baryons in a steady state 
model. According to Eq. (15.4.9), the number density of photons per unit frequency
interval in a steady state model is

$$n_\gamma(\nu) = 8\pi\nu^2 \int_{-\infty}^{t_0} \exp\left\{-\int_t^{t_0} \left[\Lambda\!\left(\nu e^{H(t_0 - t')}\right) - \Omega\!\left(\nu e^{H(t_0 - t')}\right)\right] dt'\right\} \Omega\!\left(\nu e^{H(t_0 - t)}\right) dt$$

where $\Lambda(\nu)$ is the absorption rate of a photon of frequency $\nu$, and $8\pi\nu^2\,\Omega(\nu)\,d\nu$ is
the emission rate per unit volume of photons between frequency $\nu$ and $\nu + d\nu$.
By a simple change of variables this can be written in the $t_0$-independent form

$$n_\gamma(\nu) = 8\pi\nu^2 \int_\nu^\infty \frac{d\nu'}{H\nu'}\, \Omega(\nu') \exp\left\{-\int_\nu^{\nu'} \frac{d\nu''}{H\nu''} \left[\Lambda(\nu'') - \Omega(\nu'')\right]\right\}$$    (16.3.10)

Differentiating with respect to $\nu$ yields a differential equation for $n_\gamma(\nu)$, which
can also be written as a formula for $\Omega(\nu)$ in terms of $\Lambda(\nu)$, $n_\gamma(\nu)$, and $n_\gamma'(\nu)$:

$$\Omega(\nu) = \frac{[\Lambda(\nu) + 2H]\, n_\gamma(\nu) - H\nu\, n_\gamma'(\nu)}{8\pi\nu^2 + n_\gamma(\nu)}$$    (16.3.11)

Thus, by a suitable choice of the photon emission rate, we can arrange to get any
background distribution function $n_\gamma(\nu)$ we want. For instance, if we demand the
observed Rayleigh-Jeans low-frequency behavior $n_\gamma(\nu) \propto \nu$ [see Eq. (15.5.19)]
then (16.3.11) yields in the limit $\nu \to 0$:

$$\Omega(0) = \Lambda(0) + H$$    (16.3.12)

The term H represents a purely cosmological continuous creation of photons, 
unrelated to any absorption processes. We can also obtain a Planck distribution 
function 


$$n_\gamma(\nu) = \frac{8\pi\nu^2}{[\exp(h\nu/kT) - 1]}$$

by choosing $\Omega(\nu)$ as

$$\Omega(\nu) = e^{-h\nu/kT}\, \Lambda(\nu) + \frac{H h\nu/kT}{[\exp(h\nu/kT) - 1]}$$    (16.3.13)


The first term represents simply the usual emission processes that will always 
accompany any absorption [compare Eq. (15.4.7)], while the second term represents 
a continuous creation of photons. However, there is no a priori reason why the 
rate of continuous creation of photons should have the particular frequency 
dependence shown in Eq. (16.3.13), so from the standpoint of a steady state model, 
the Planck distribution law is possible but quite artificial. Indeed, there is no 
particular reason why low-frequency photons should be continuously created at 
precisely the rate $8\pi H\nu^2\,d\nu$ required by Eq. (16.3.12), so even the Rayleigh-Jeans 
low-frequency behavior is somewhat unnatural. 
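
As a consistency check on Eqs. (16.3.11)-(16.3.13), the sketch below inserts a Planck distribution into (16.3.11), using a numerical derivative, and compares the result with the closed form (16.3.13). The function names, the units with h = 1, and the sample values of the absorption rate, H, and kT are illustrative assumptions.

```python
import math


def planck_n(nu, kT, h=1.0):
    """Planck photon number density per unit frequency, 8 pi nu^2/(e^x - 1)."""
    return 8.0 * math.pi * nu**2 / math.expm1(h * nu / kT)


def omega_from_n(nu, absorb, H, kT, h=1.0, eps=1e-6):
    """Emission rate from Eq. (16.3.11) with a central-difference derivative.

    `absorb` is the value of the absorption rate at the frequency nu.
    """
    dn = (planck_n(nu * (1 + eps), kT, h)
          - planck_n(nu * (1 - eps), kT, h)) / (2 * eps * nu)
    n = planck_n(nu, kT, h)
    return ((absorb + 2.0 * H) * n - H * nu * dn) / (8.0 * math.pi * nu**2 + n)


def omega_planck(nu, absorb, H, kT, h=1.0):
    """Closed form of Eq. (16.3.13)."""
    x = h * nu / kT
    return math.exp(-x) * absorb + H * x / math.expm1(x)


for nu in (0.1, 1.0, 3.0):  # assumed sample frequencies
    print(nu, omega_from_n(nu, 0.7, 0.2, 1.0), omega_planck(nu, 0.7, 0.2, 1.0))
```

At very low frequency `omega_planck` tends to the sum of the absorption rate and H, reproducing the Rayleigh-Jeans result (16.3.12).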
Some support for the steady state model comes from quite a different quarter. 
From time to time attempts have been made to formulate electrodynamics and 
other field theories in terms of a direct action at a distance. 15 Such attempts 
generally foundered, because the electromagnetic effects of charged particles were 
found to correspond to an equal mixture of advanced and retarded solutions of the 
Maxwell equations, rather than the usual retarded solution. In 1945 Wheeler and 
Feynman 16 showed that this difficulty could be overcome by taking into account 
the electromagnetic interaction at a distance of the accelerated and test charges 
with all the other charges in the universe. However, they considered a static 
cosmological model, and so could obtain a net electromagnetic interaction cor- 
responding to either a pure retarded or a pure advanced solution. Hogarth 17 later 
suggested that this ambiguity could be removed by considering a more realistic 
model, taking into account the expansion of the universe. According to Hoyle and 
Narlikar, 18 only a purely retarded solution is possible for a steady state model, 
while only a purely advanced solution is possible for a Friedmann model with 
k < 0. Hoyle and Narlikar subsequently extended these considerations to the
C-field, 19 to the theory of gravitation, 20 and to quantum electrodynamics. 21
This line of development certainly represents an intriguing approach to the old 
problem of relating the physics of the microcosm to the properties of the universe 
as a whole, a problem to which we shall return in the next section. However, it is 
too early to conclude that a steady state universe is in any sense required by 
considerations of microphysics, because there is no reason to suppose that electro- 
dynamics and other field theories need to be formulated in terms of an action at a 
distance. 


4 Models with a Varying Constant of Gravitation 

Gravitational forces are remarkably weak by the standards of atomic or 
nuclear physics. For instance, the ratio of the gravitational to the electric force 
between the electron and the proton has the value 

$$\frac{G m_p m_e}{e^2} = 4.4 \times 10^{-40}$$    (16.4.1)

Despite many attempts, 22 there have been no convincing explanations of why
such a tiny dimensionless number should appear in the fundamental laws of 
physics. However, there is one clue, which suggests that numbers like (16.4.1) 
are not determined solely by considerations of microphysics, but in part by the 
influence of the whole universe. This clue is simply the fact that, from the quantities 
G, $\hbar$, c, and the Hubble constant $H_0$, it is possible to construct a mass, which is not
too different from the mass of a typical elementary particle, such as the pion:

$$\left(\frac{\hbar^2 H_0}{Gc}\right)^{1/3} \simeq m_\pi$$    (16.4.2)

(For $H_0^{-1} = 10^{10}$ years, the left-hand side has the value 60 MeV/$c^2$, while the
pion mass is 140 MeV/$c^2$. If $e^2/c$ were used in place of $\hbar$, the left-hand side of
(16.4.2) would become of the same order of magnitude as the electron mass.) Of
course, one is perfectly free to regard (16.4.2) as a meaningless numerical coincidence,
but it should be noted that the particular combination of $\hbar$, $H_0$, G, and c
appearing in (16.4.2) is very much closer to a typical elementary particle mass
than other random combinations of these quantities; for instance, from $\hbar$, G, and
c alone one can form a single quantity $(\hbar c/G)^{1/2}$ with the dimensions of a mass, but
this has the value $1.22 \times 10^{22}$ MeV/$c^2$, more than a typical particle mass by
about 20 orders of magnitude!
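
The orders of magnitude quoted above are easy to reproduce. The sketch below evaluates the force ratio (16.4.1), the mass on the left-hand side of (16.4.2), and the mass $(\hbar c/G)^{1/2}$, using rounded CGS constants that are assumed inputs for illustration. With these particular inputs the mass in (16.4.2) comes out at several tens of MeV, the same order as the rough figure quoted in the text.

```python
import math

# Rounded CGS values, treated here as assumed inputs.
G = 6.674e-8                    # cm^3 g^-1 s^-2
HBAR = 1.0546e-27               # erg s
C = 2.998e10                    # cm / s
E2 = (4.803e-10) ** 2           # electron charge squared, esu^2
M_P = 1.6726e-24                # proton mass, g
M_E = 9.109e-28                 # electron mass, g
YEAR = 3.156e7                  # s
MEV_PER_GRAM = C**2 / 1.602e-6  # uses 1 MeV = 1.602e-6 erg


def force_ratio():
    """Gravitational over electric force between electron and proton."""
    return G * M_P * M_E / E2


def dirac_mass_mev(hubble_time_years=1e10):
    """Left-hand side of Eq. (16.4.2), in MeV/c^2."""
    H0 = 1.0 / (hubble_time_years * YEAR)
    return (HBAR**2 * H0 / (G * C)) ** (1.0 / 3.0) * MEV_PER_GRAM


def planck_mass_mev():
    """The mass (hbar c / G)**(1/2), in MeV/c^2."""
    return math.sqrt(HBAR * C / G) * MEV_PER_GRAM


print(force_ratio(), dirac_mass_mev(), planck_mass_mev())
```

The cube root makes the result of `dirac_mass_mev` only weakly sensitive to the assumed Hubble time, which is why the comparison with the pion mass is robust at the order-of-magnitude level.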

In considering the possible interpretations of Eq. (16.4.2), one should be 
careful to distinguish it from other numerical “coincidences” such as the rough 
relation among G, $H_0$, $m_p$, and the present cosmic baryon number density $n_0$:

$$G n_0 m_p \approx H_0^2$$    (16.4.3)

This is a relation between two cosmological parameters, $n_0$ and $H_0$, and is in fact
required by various cosmological models, such as the Friedmann models (unless
$q_0 \ll 1$ or $q_0 \gg 1$) and Hoyle’s version of the steady state model. [See Eqs.
(15.2.6) and (16.3.8).] In contrast, Eq. (16.4.2) relates a single cosmological
parameter, $H_0$, to the fundamental constants $\hbar$, G, c and $m_\pi$, and is so far
unexplained.

There are other numerical coincidences that have been noticed from time to 
time, but most of these are combinations of (16.4.2) and (16.4.3), sometimes with 
$e^2/c = \hbar/137$ in place of $\hbar$ and with other masses in place of $m_\pi$. For instance, it
has often been remarked that the ratio of the atomic time unit $e^2/m_e c^3$ to the
Hubble time $H_0^{-1}$ is of the same order of magnitude, about $10^{-40}$, as the ratio
(16.4.1) of gravitational and electric forces in atoms, but this is equivalent to
(16.4.2), with $e^2/c$ in place of $\hbar$ and $m_e^{2/3} m_p^{1/3}$ in place of $m_\pi$.

If we choose to regard the numerical relation (16.4.2) as having a real though 
mysterious significance, then we must face the problem that in most cosmologies 
$H_0$ is not a constant, but a function of the age of the universe. One way of dealing
with this problem is to replace $H_0$ with a quantity of comparable magnitude that
is a constant; for instance, in a closed Friedmann model we can use the reciprocal
of the time it takes for the universe to expand to its maximum extent, while in a
steady state model, the Hubble constant itself will do. The only trouble with this
approach is that it does not lead anywhere, and in particular, it leaves us with
fundamental dimensionless constants, such as (16.4.1) or $G m_\pi^2/\hbar c$, which are
inexplicably tiny.

In 1937 a very different approach was suggested by Dirac. 23 He proposed 
that relations like (16.4.2) are fundamental though as yet unexplained truths, 
which remain valid, with a constant factor of proportionality, even though the 
Hubble “constant” $\dot{R}/R$ varies with the age of the universe. It follows then that
one or more of the “constants” $\hbar$, G, c, and $m_\pi$ must vary over cosmic time scales.
In order to avoid having to reformulate the whole of atomic and nuclear physics,
Dirac chose G as the “constant” that varies with time, and in order to preserve
(16.4.2), he proposed that

$$G \propto \frac{\dot{R}}{R}$$    (16.4.4)

In addition, Dirac suggested that relations like (16.4.3) also remain true, with a
constant factor of proportionality, as the universe expands. Since $n \propto R^{-3}$, it
follows that

$$G R^{-3} \propto \frac{\dot{R}^2}{R^2}$$    (16.4.5)

Eliminating G(t) from (16.4.4) and (16.4.5) then yields a differential equation for
R(t):

$$\dot{R} \propto R^{-2}$$

with solution

$$R \propto t^{1/3}$$    (16.4.6)

Either (16.4.4) or (16.4.5) then gives the gravitational constant a time dependence

$$G \propto t^{-1}$$    (16.4.7)

Thus in Dirac’s cosmology, there is no fundamental significance to very small
dimensionless numbers like $10^{-40}$; the reason that (16.4.1) is this small is simply
that the universe is old.

For k = +1, there are still significant cosmological parameters that are 
constant and grossly different from unity, such as the number $nR^3$ of particles
within a sphere whose radius is of the order of the radius of curvature of space.
To avoid this, Dirac also proposed that space is flat, with k = 0, so that the
absolute value of the Robertson-Walker scale factor R(t) and pure numbers like
$nR^3$ should be of no physical significance.

If the gravitational constant varies, then general relativity needs to be replaced 
with some other field theory of gravitation. Dirac did not specify what this field 
theory would be like, so his cosmological model remained incomplete. Nevertheless, 
it made a number of definite predictions. First, Eq. (16.4.6) gives a relation between 
the present Hubble constant $H_0$ and the present age of the universe $t_0$:

$$t_0 = \tfrac{1}{3} H_0^{-1}$$    (16.4.8)

Even for $H_0^{-1}$ as large as $13 \times 10^9$ years, this gives an age of only $4.3 \times 10^9$
years, less than the age of the earth and moon determined by radioactive dating
(which does not involve assumptions about G). Thus the Dirac theory appears
already to conflict with observation. Equation (16.4.6) gives a deacceleration
parameter $q_0 = 2$, which cannot be ruled out with present data. (See Section
14.6.) Finally, Eq. (16.4.7) gives a present rate of decrease of the “constant” of
gravitation

$$\left(\frac{\dot{G}}{G}\right)_0 = -t_0^{-1} = -3H_0$$    (16.4.9)

The observable implications of a decreasing gravitational constant are discussed 
at the end of this section. 
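
The three predictions just listed follow from elementary properties of a power-law scale factor. The sketch below treats a general $R \propto t^n$ and then specializes to Dirac's $n = 1/3$. The function names are illustrative assumptions, and the Hubble time of $13 \times 10^9$ years is the example value used in the text.

```python
def power_law_cosmology(n, hubble_time):
    """For R ~ t**n: H = n/t, so the age is n/H0 and q0 = (1 - n)/n."""
    age = n * hubble_time
    q0 = (1.0 - n) / n
    return age, q0


def gdot_over_g(hubble_time, n=1.0 / 3.0):
    """G ~ 1/t implies (dG/dt)/G = -1/t0 = -1/(n * hubble_time)."""
    return -1.0 / (n * hubble_time)


age, q0 = power_law_cosmology(1.0 / 3.0, 13e9)  # Hubble time in years
print(age, q0, gdot_over_g(13e9))
```

With a Hubble time of $13 \times 10^9$ years this gives an age near $4.3 \times 10^9$ years, $q_0 = 2$, and a fractional decrease of G of about $2.3 \times 10^{-10}$ per year.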

Dirac’s theory inspired a number of attempts to formulate a field theory of 
gravitation in which the effective “constant” of gravitation is some function of a 
scalar field. Jordan 24 proposed one theory, which involved a nonconserved 
energy-momentum tensor, and was severely criticized on these and other grounds 
by Fierz 25 and Bondi. 26 A subsequent reformulation 27 removed most of these 
objections, but Jordan’s theory still did not successfully incorporate nonrelativistic 
matter. The most interesting and complete scalar-tensor theory of gravitation is 
that proposed by Brans and Dicke 28 in 1961, which we have already discussed in 
some detail in Sections 7.3 and 9.9. In this theory, the gravitational constant G 
is replaced with the reciprocal of a scalar field $\phi$. In order to incorporate relations
such as (16.4.3) into the theory, $\phi$ is assumed to obey a field equation

$$\Box^2 \phi \equiv \phi_{;\mu}{}^{;\mu} = \frac{8\pi}{3 + 2\omega}\, T^{\mu}{}_{\mu}$$    (16.4.10)

where $T^{\mu\nu}$ is the energy-momentum tensor of matter (not including $\phi$) and $\omega$ is a
dimensionless coupling parameter. In order not to interfere with the successes of the
Principle of Equivalence, $\phi$ is assumed not to enter into the equations of motion of
ordinary matter and radiation, so that $T^{\mu\nu}$ obeys the familiar conservation law

$$T^{\mu\nu}{}_{;\nu} = 0$$    (16.4.11)

The Bianchi identities then require the gravitational field equations to take the 
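form (7.3.14), or equivalently,

$$R_{\mu\nu} = -\frac{8\pi}{\phi}\left[T_{\mu\nu} - \left(\frac{1+\omega}{3+2\omega}\right) g_{\mu\nu}\, T^{\lambda}{}_{\lambda}\right] - \frac{\omega}{\phi^2}\, \phi_{;\mu}\phi_{;\nu} - \frac{1}{\phi}\, \phi_{;\mu;\nu}$$    (16.4.12)
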
This theory becomes equivalent to that of Jordan 27 in the special case of an 
energy-momentum tensor with vanishing trace. 

In applying the Brans-Dicke theory to cosmology, we again consider the
universe to be smeared out into a homogeneous isotropic continuum, as in Chapters
14 and 15. The metric then has the Robertson-Walker form (14.2.1); the energy-momentum
tensor has the perfect fluid form (14.2.12), and the scalar field $\phi$ is a
function of time alone. A straightforward calculation using Eqs. (15.1.6), (15.1.7),
(15.1.11), and (15.1.13) gives the time-time component of Eq. (16.4.12) as

$$\frac{3\ddot{R}}{R} = -\frac{8\pi}{(3 + 2\omega)\phi}\, \{(2 + \omega)\rho + 3(1 + \omega)p\} - \omega\frac{\dot{\phi}^2}{\phi^2} - \frac{\ddot{\phi}}{\phi}$$    (16.4.13)

while the space-space components of Eq. (16.4.12) give

$$-\frac{\ddot{R}}{R} - \frac{2\dot{R}^2}{R^2} - \frac{2k}{R^2} = -\frac{8\pi}{(3 + 2\omega)\phi}\, \{(1 + \omega)\rho - \omega p\} + \frac{\dot{\phi}\dot{R}}{\phi R}$$    (16.4.14)

and the time-space components simply say that zero equals zero. The field equation
(16.4.10) for $\phi$ here reads

$$\frac{d}{dt}\left(\dot{\phi} R^3\right) = \frac{8\pi}{(3 + 2\omega)}\, (\rho - 3p)\, R^3$$    (16.4.15)

and the conservation laws (16.4.11) give, as in Chapter 14,

$$\dot{\rho} = -\frac{3\dot{R}}{R}\, (\rho + p)$$    (16.4.16)

By eliminating $\ddot{R}$ from Eqs. (16.4.13) and (16.4.14), and using Eq. (16.4.15) to
eliminate $\ddot{\phi}$, we can derive a first-order equation analogous to (15.1.20):

$$\frac{\dot{R}^2}{R^2} + \frac{k}{R^2} = \frac{8\pi\rho}{3\phi} - \frac{\dot{\phi}\dot{R}}{\phi R} + \frac{\omega\dot{\phi}^2}{6\phi^2}$$    (16.4.17)

We can recover (16.4.13) and (16.4.14) from the derivative of (16.4.17), so the 
fundamental equations of the Brans-Dicke cosmology may be taken as Eqs. 
(16.4.15)-(16.4.17), plus an equation of state giving p as a function of $\rho$. In addition,
Eq. (9.9.11) shows that the gravitational “constant,” measured by the
observation of slowly moving particles or in time-dilation experiments, is

$$G = \left(\frac{2\omega + 4}{2\omega + 3}\right) \frac{1}{\phi}$$    (16.4.18)

For any given equation of state $p = p(\rho)$, Eqs. (16.4.15)-(16.4.17) may be
regarded as a second-order differential equation plus two first-order differential
equations for the three variables R, $\phi$, and $\rho$. It follows that these equations
uniquely determine R(t), $\phi(t)$, and $\rho(t)$ for all t, provided we are given the present
value of four variables, say $R_0$, $\dot{R}_0$, $\phi_0$, and $\rho_0$, as well as the constants $\omega$ and k.
This is rather surprising, because in a Friedmann model we only need to be given
the initial values of three quantities, say $R_0$, $\dot{R}_0$, and of course G, in order to be
able to calculate R(t) and $\rho(t)$ for all t. [See Eq. (15.2.1).]
Originally, Brans and Dicke 28 eliminated this extra degree of freedom by
imposing one additional constraint, that $\dot{\phi} R^3$ vanishes at the initial singularity
where R = 0:

$$\dot{\phi} R^3 \to 0 \quad \text{for} \quad R \to 0$$    (16.4.19)

With this initial condition, and with a given $\omega$, k, and equation of state, we can
obtain a complete solution for R(t), $\rho(t)$, and $\phi(t)$ by specifying only three present
parameters, such as $R_0$, $\dot{R}_0$, and $\phi_0$, or, more conveniently, $H_0$, $G_0$, and $q_0$ (or $\rho_0$).

A few years later, Dicke 29 suggested that the relevant solutions may in fact
be those that do not satisfy the constraint (16.4.19). In general, all solutions have a
singularity with R = 0 at a finite time, which as usual we define to be t = 0.
Equation (16.4.15) then has the solution:

$$\dot{\phi}(t) R^3(t) = \frac{8\pi}{2\omega + 3} \int_0^t \left[\rho(t') - 3p(t')\right] R^3(t')\, dt' - C$$    (16.4.20)

where C is an integration constant, which may be positive, negative, or zero. For
C = 0, we obtain the three-parameter family of models satisfying the initial
condition (16.4.19). For $C \neq 0$, we obtain a four-parameter family of solutions,
the extra parameter being needed to fix the value of C.

The properties of these various solutions are sufficiently subtle to make it 
worth our while to study in some detail the one case where (16.4.15)-(16.4.17) can 
be solved analytically, the case of zero pressure and zero curvature:

$$p = 0 \qquad k = 0$$

Here (16.4.16) gives

$$\rho \propto R^{-3}$$    (16.4.21)

so Eq. (16.4.20) gives immediately

$$\dot{\phi} = \frac{8\pi\rho}{2\omega + 3}\, (t - t_c)$$    (16.4.22)

where

$$t_c \equiv \frac{(2\omega + 3)\, C}{8\pi\rho R^3}$$    (16.4.23)

It proves extremely convenient to introduce a new dependent variable

$$u \equiv (t - t_c)\, \frac{\dot{\phi}}{\phi} = \frac{8\pi\rho\, (t - t_c)^2}{(2\omega + 3)\phi} > 0$$    (16.4.24)

By expressing $\rho$ and $\dot{\phi}/\phi$ in Eq. (16.4.17) in terms of u, and setting k = 0, we can
immediately solve for $\dot{R}/R$:

$$\frac{2(t - t_c)\dot{R}}{R} = -u \pm \left(1 + \frac{2\omega}{3}\right)^{1/2} (u^2 + 4u)^{1/2}$$    (16.4.25)
Also, Eq. (16.4.21) and the logarithmic derivative of (16.4.22) give

$$\frac{\dot{u}}{u} = -\frac{3\dot{R}}{R} + 2(t - t_c)^{-1} - \frac{\dot{\phi}}{\phi}$$

or, using (16.4.24) and (16.4.25),

$$(t - t_c)\, \dot{u} = \tfrac{1}{2} u \left\{u + 4 \mp 3\left(1 + \frac{2\omega}{3}\right)^{1/2} (u^2 + 4u)^{1/2}\right\}$$    (16.4.26)

This first-order equation must be integrated to find u(t), following which (16.4.24)
and (16.4.25) can be integrated to determine $\phi(t)$ and R(t).

One obvious class of solutions to Eq. (16.4.26) are those with u a constant,
equal to one of the zeros of the expression on the right-hand side of Eq. (16.4.26).
In order to have such a zero with u > 0, we must take the upper sign of the square
root in Eqs. (16.4.26) and (16.4.25), and the solution is

$$u = \frac{2}{3\omega + 4}$$    (16.4.27)


For this solution, we must take $t_c = 0$, because otherwise Eq. (16.4.25) would give
R = 0 only at the time $t = t_c$, and we have agreed to set our clocks so that this
singularity occurs at t = 0. With $t_c = 0$, Eqs. (16.4.24) and (16.4.25) yield the
solutions 28

$$\phi \propto t^{2/(4 + 3\omega)}$$    (16.4.28)

$$R \propto t^{(2\omega + 2)/(3\omega + 4)}$$    (16.4.29)

$$\frac{4\pi\rho t^2}{\phi} = \frac{(2\omega + 3)}{(3\omega + 4)}$$    (16.4.30)
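
The power-law solution can be verified by direct substitution. The sketch below checks that the exponents in (16.4.28) and (16.4.29), together with (16.4.30), satisfy Eqs. (16.4.15) and (16.4.17) for k = 0 and p = 0. The function names and the sample values of $\omega$ are illustrative assumptions.

```python
def bd_exponents(omega):
    """Exponents in phi ~ t**a, R ~ t**b from Eqs. (16.4.28) and (16.4.29)."""
    a = 2.0 / (4.0 + 3.0 * omega)
    b = (2.0 * omega + 2.0) / (3.0 * omega + 4.0)
    return a, b


def bd_residuals(omega):
    """Residuals of Eqs. (16.4.17) and (16.4.15) for k = 0, p = 0.

    For power laws, H*t = b and (dphi/dt)*t/phi = a, while Eq. (16.4.30)
    fixes x = 4 pi rho t^2 / phi = (2 omega + 3)/(3 omega + 4).
    """
    a, b = bd_exponents(omega)
    x = (2.0 * omega + 3.0) / (3.0 * omega + 4.0)
    # Eq. (16.4.17) multiplied by t^2.
    first_order = b**2 - (2.0 * x / 3.0 - a * b + omega * a**2 / 6.0)
    # Eq. (16.4.15) divided by phi R^3 / t^2.
    scalar_eq = a * (a - 1.0 + 3.0 * b) - 2.0 * x / (3.0 + 2.0 * omega)
    return first_order, scalar_eq


for omega in (0.5, 6.0, 500.0):  # assumed sample couplings
    print(omega, bd_exponents(omega), bd_residuals(omega))
```

For large $\omega$ the exponents tend to 0 and 2/3, which is the Friedmann behavior with constant $\phi$.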


For $t_c \neq 0$, it is necessary to analyze how u in Eq. (16.4.26) moves between
the singular points u = 0, $u = 2/(3\omega + 4)$, and $u = \infty$. The results depend
critically on whether $t_c$ is positive or negative.

$t_c > 0$. Here u drops monotonically from $u = \infty$ at t = 0 to u = 0 at
$t = t_c$, and then rises monotonically to the value (16.4.27) as $t \to \infty$. The sign of
the square root in Eqs. (16.4.25) and (16.4.26) switches at $t_c$ from the bottom sign
for $t < t_c$ to the top sign for $t > t_c$. The solutions of (16.4.26) are thus given by


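$$\ln\left(1 - \frac{t}{t_c}\right) = -2 \int_u^\infty \frac{du}{u\left\{u + 4 + 3(1 + 2\omega/3)^{1/2} (u^2 + 4u)^{1/2}\right\}} \quad \text{for } t < t_c$$

$$\ln\left(\frac{t}{t_c} - 1\right) = 2 \int^u \frac{du}{u\left\{u + 4 - 3(1 + 2\omega/3)^{1/2} (u^2 + 4u)^{1/2}\right\}} + \text{constant} \quad \text{for } t > t_c$$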
These integrals can be done in closed form, but it is more interesting to look at the
behavior of the solutions at very early and very late times. For $t \ll t_c$, we find

$$u \to \frac{\left(3[1 + 2\omega/3]^{1/2} - 1\right)}{(4 + 3\omega)(t/t_c)}$$

so Eqs. (16.4.24) and (16.4.25) have the solutions 29

$$\phi \propto t^{(1 - 3[1 + 2\omega/3]^{1/2})/(4 + 3\omega)}$$    (16.4.31)

$$R \propto t^{(1 + \omega + [1 + 2\omega/3]^{1/2})/(4 + 3\omega)}$$    (16.4.32)

For $t \gg t_c$, u approaches the value (16.4.27), and the solutions of (16.4.24) and
(16.4.25) go over to the forms (16.4.28) and (16.4.29).

$t_c < 0$. Here u drops monotonically from $u = \infty$ at t = 0 to the value
(16.4.27) as $t \to \infty$. The square roots in (16.4.25) and (16.4.26) keep the upper sign,
so (16.4.26) has the solution

$$\ln\left(1 + \frac{t}{|t_c|}\right) = 2 \int_u^\infty \frac{du}{u\left\{3(1 + 2\omega/3)^{1/2} (u^2 + 4u)^{1/2} - u - 4\right\}}$$

For $t \ll |t_c|$, we find

$$u \to \frac{\left(3[1 + 2\omega/3]^{1/2} + 1\right)}{(4 + 3\omega)(t/|t_c|)}$$

so Eqs. (16.4.24) and (16.4.25) have the solutions 29

$$\phi \propto t^{(3[1 + 2\omega/3]^{1/2} + 1)/(4 + 3\omega)}$$    (16.4.33)

$$R \propto t^{(1 + \omega - [1 + 2\omega/3]^{1/2})/(4 + 3\omega)}$$    (16.4.34)

For $t \gg |t_c|$, u approaches the value (16.4.27), and the solutions of (16.4.24) and
(16.4.25) again go over to the forms (16.4.28) and (16.4.29).

Thus there are three kinds of solution, all of which behave alike for $t \gg |t_c|$,
but which differ radically for $t \ll |t_c|$. Only the simple solution with $t_c = 0$ goes
over smoothly to the zero curvature Friedmann solution ($\phi$ constant, $R \propto t^{2/3}$)
in the limit of large $\omega$; the solutions with $t_c > 0$ or $t_c < 0$ have $\phi \to \infty$ or $\phi \to 0$,
respectively, as $t \to 0$ for any finite $\omega$.

Although these solutions were derived under the assumption of zero pressure 
and zero curvature, they exhibit many of the properties of the much more compli- 
cated general solutions. In general, the solutions may be classified according to 
whether the integration constant C in (16.4.20) is zero, or positive, or negative. 
For sufficiently large $t$, the integral in Eq. (16.4.20) is dominated by the
matter-dominated era, in which $\rho \propto R^{-3}$, so the integral grows like $t$, and thus the
integration constant eventually becomes negligible. In this limit, we have

$$\dot\phi = \frac{8\pi\rho t}{2\omega + 3} \tag{16.4.35}$$

and all solutions converge to the $C = 0$ solution, which unfortunately must be
calculated by a numerical solution of (16.4.17) and (16.4.35) with $\rho \propto R^{-3}$. On the
other hand, for $t$ sufficiently small, the integration constant will dominate in
(16.4.20), provided of course that $C \neq 0$. In this case, the curvature and density
terms in (16.4.17) become negligible for $t \to 0$, and the solutions go over to the
previously derived forms

$$\phi \propto t^{(1 \mp 3[1 + 2\omega/3]^{1/2})/(4 + 3\omega)} \tag{16.4.36}$$

$$R \propto t^{(1 + \omega \pm [1 + 2\omega/3]^{1/2})/(4 + 3\omega)} \tag{16.4.37}$$

with upper sign for $C > 0$ and lower sign for $C < 0$. The $C = 0$ solutions go over
smoothly to the Friedmann-model solutions for large $\omega$, but the $C \neq 0$ solutions
behave peculiarly at $t = 0$ for any $\omega$.
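
The early-time power laws quoted above can be checked numerically. The following Python sketch evaluates the exponents of $t$ in $\phi$ and $R$ for both signs; the function name and the sample values of $\omega$ are illustrative assumptions, not part of the text. It also verifies that the two exponents satisfy (exponent of $\phi$) + 3 (exponent of $R$) = 1, which is what one expects if $\dot\phi R^3$ is constant when the integration constant dominates (an assumption about the structure of (16.4.20), which is not reproduced here).

```python
import math

def early_time_exponents(omega, upper_sign=True):
    """Return (phi_exp, r_exp), the exponents of t in phi and R as t -> 0.

    upper_sign=True takes the upper signs in the two power laws,
    upper_sign=False the lower signs.
    """
    s = math.sqrt(1.0 + 2.0 * omega / 3.0)
    sign = 1.0 if upper_sign else -1.0
    denom = 4.0 + 3.0 * omega
    phi_exp = (1.0 - sign * 3.0 * s) / denom
    r_exp = (1.0 + omega + sign * s) / denom
    return phi_exp, r_exp

if __name__ == "__main__":
    for omega in (6.0, 100.0):
        for upper in (True, False):
            p, q = early_time_exponents(omega, upper)
            print(f"omega={omega:g} upper={upper}: "
                  f"phi ~ t^{p:+.4f}, R ~ t^{q:.4f}, p + 3q = {p + 3 * q:.6f}")
```

With the upper signs the exponent of $\phi$ is negative, so $\phi$ grows without bound as $t \to 0$; with the lower signs it is positive, so $\phi \to 0$.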

The Brans-Dicke theory does not offer a satisfactory solution to the numerical
relations discussed at the beginning of this section. Generally $\dot\phi/\phi$ and $1/t$ will be
of the order of the Hubble “constant” $H$, and $\phi$ is of the order of $1/G$, so once the
integration constant $C$ becomes negligible, Eq. (16.4.35) will become more or less
the same as the relation (16.4.3). However, Eq. (16.4.3) is not even approximately
valid at very early times when $C$ is not negligible. More important, the mysterious
relation (16.4.2) is not explained at all by the Brans-Dicke theory. Indeed, in the
simplest case of zero pressure, zero curvature, and zero integration constant $t_c$,
Eqs. (16.4.29) and (16.4.28) show that $H \propto 1/t$ while $G \propto t^{-2/(4 + 3\omega)}$, so the mass
$(\hbar^2 H/Gc)^{1/3}$ decreases with time, and the relation (16.4.2) can only be valid for a
brief period in the history of the universe.
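
The time dependence of the mass $(\hbar^2 H/Gc)^{1/3}$ in this simplest case follows from the two power laws just quoted. A minimal Python sketch (the function name is an illustrative assumption) computes the exponent $n$ in $(\hbar^2 H/Gc)^{1/3} \propto t^n$:

```python
def mass_time_exponent(omega):
    """Exponent n such that (hbar^2 H / G c)^(1/3) varies as t^n.

    Uses H ~ 1/t and G ~ t^(-2/(4 + 3 omega)), the zero-pressure,
    zero-curvature, t_c = 0 behavior quoted in the text.
    """
    h_exponent = -1.0
    g_exponent = -2.0 / (4.0 + 3.0 * omega)
    return (h_exponent - g_exponent) / 3.0

if __name__ == "__main__":
    for omega in (6.0, 100.0):
        print(f"omega = {omega:g}: mass ~ t^{mass_time_exponent(omega):.4f}")
```

The exponent is negative for any positive $\omega$ and tends to $-1/3$ for large $\omega$, so the mass decreases with time as stated.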

Now let us turn to the observational implications of this theory. Neither the 
gravitational field nor the Brans-Dicke field have any direct effect on the nuclear 
processes that are believed to produce helium in the early universe, but they do 
affect the rate of expansion of the universe, which in turn governs the amount of 
helium that can be produced. (See Section 15.7.) For solutions with $C = 0$, the
numerical integration$^{29}$ of Eqs. (16.4.17) and (16.4.20) shows that in the case
$\omega = 5$, $k = 0$, $H_0^{-1} = 9.5 \times 10^9$ years, $\rho_0 = 2 \times 10^{-29}$ g/cm$^3$, the effect of
the Brans-Dicke field is to shorten the time required for the temperature to drop
to $10^9$ °K by a factor of 0.45, so that more neutrons are left when nucleosynthesis
begins, and the cosmologically produced abundance of helium is about 42% by
weight, rather than 27%. For $k = -1$ models with a smaller present density, the
difference between the Friedmann and Brans-Dicke models is considerably less.$^{30}$
On the other hand, with a nonvanishing integration constant in (16.4.20), we can
make the expansion rate in the early universe essentially anything we like. As
remarked in Section 15.7, for a moderate speed-up of the expansion the helium
production is enhanced, but if the expansion is speeded up too much, then there
will not be enough time for the reaction $n + p \to d + \gamma$ to produce enough
deuterium to initiate nucleosynthesis, and very little helium will be produced.

During the more recent epoch within the range of optical telescopes, the 
integration constant C has presumably (though not certainly) been negligible, so 
that for large co the relations among curvature, density, age, Hubble constant, 
and deacceleration parameter are pretty much the same as for the Friedmann 
models. For instance, for a zero-pressure zero-curvature model in the limit $t \gg |t_c|$,
Eqs. (16.4.28)-(16.4.30) give the relations

$$H_0 t_0 = \frac{2 + 2\omega}{4 + 3\omega} \tag{16.4.38}$$

$$q_0 = \frac{\omega + 2}{2\omega + 2} \tag{16.4.39}$$

$$\frac{4\pi G\rho_0}{H_0^2} = \frac{(4 + 3\omega)(4 + 2\omega)}{(2 + 2\omega)^2} \tag{16.4.40}$$

For $\omega = 6$ these three quantities have the values 0.64, 0.57, 1.80, while the
corresponding values for a Friedmann model with $k = 0$ are 0.67, 0.50, and 1.50.
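
The quoted values can be reproduced directly from the three relations for the zero-pressure, zero-curvature model. The Python sketch below is only an illustration; the function name is an assumption, and the Friedmann limit is approximated by a very large $\omega$.

```python
def bd_flat_relations(omega):
    """Return (H0*t0, q0, 4*pi*G*rho0/H0^2) for the zero-pressure,
    zero-curvature Brans-Dicke model in the limit t >> |t_c|."""
    h0_t0 = (2.0 + 2.0 * omega) / (4.0 + 3.0 * omega)
    q0 = (omega + 2.0) / (2.0 * omega + 2.0)
    density = (4.0 + 3.0 * omega) * (4.0 + 2.0 * omega) / (2.0 + 2.0 * omega) ** 2
    return h0_t0, q0, density

if __name__ == "__main__":
    for label, omega in (("omega = 6", 6.0), ("Friedmann limit", 1.0e9)):
        h0_t0, q0, density = bd_flat_relations(omega)
        print(f"{label}: H0 t0 = {h0_t0:.2f}, q0 = {q0:.2f}, "
              f"4 pi G rho0 / H0^2 = {density:.2f}")
```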

Certainly the most distinctive observable feature of both the Dirac and the 
Brans-Dicke theories is the decrease of the gravitational constant G with time. 
In the Brans-Dicke theory the present rate of change of G is given by (16.4.35) 
as 


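$$\left(\frac{\dot G}{G}\right)_0 = -\left(\frac{\dot\phi}{\phi}\right)_0 = -\frac{8\pi\rho_0 t_0}{(2\omega + 3)\phi_0} = -\frac{8\pi G_0\rho_0 t_0}{2\omega + 4} \tag{16.4.41}$$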


In general, in order to express $\rho_0$ and $t_0$ in terms of $G_0$, $H_0$, and $q_0$, it would be
necessary to resort to a numerical solution of the differential equations (16.4.35)
and (16.4.17). However, if $\omega$ is reasonably large (say $\omega > 5$), then the rate of
decrease of $G$ may be calculated to a sufficient degree of accuracy by using for $\rho_0$
and $t_0$ in (16.4.41) the values calculated using the Einstein equations in Sections
15.2 and 15.3. The general Friedmann-model result for $\rho_0$ is

$$\frac{4\pi G_0\rho_0}{3H_0^2} = q_0 \tag{16.4.42}$$

Values for $H_0 t_0$ and the resulting values for the rate (16.4.41) are given in Table
16.1.

In the special case $k = 0$, where we have an analytic solution, Eqs. (16.4.18),
(16.4.28), and (16.4.29) yield the “exact” result

$$\left(\frac{\dot G}{G}\right)_0 = -\frac{H_0}{1 + \omega}$$

so the estimate of this rate given in Table 16.1 is in this case about 12% too low
for $\omega = 6$. The estimates in Table 16.1 are in good agreement with the “exact”
results that have been computed$^{30}$ for $k = -1$ and $\omega = 5$ or $\omega = 10$.


Table 16.1 Rate of Decrease of G in Various Brans-Dicke Models and in the
Dirac Model$^a$

Model          $q_0$      $t_0 H_0$ ($\omega = \infty$)      $(\dot G/G)_0$

Brans-Dicke    $\ll 1$    $1$                         $-3 q_0 H_0/(\omega + 2)$
Brans-Dicke    $1/2$      $2/3$                       $-H_0/(\omega + 2)$
Brans-Dicke    $1$        $\pi/2 - 1$                 $-1.71 H_0/(\omega + 2)$
Brans-Dicke    $\gg 1$    $\pi/(2\sqrt{2 q_0})$       $-3.34 \sqrt{q_0}\, H_0/(\omega + 2)$
Dirac          $2$        $1/3$                       $-3 H_0$

$^a$ The values of $(\dot G/G)_0$ in the Brans-Dicke models are estimated from Eq. (16.4.41), using for $t_0$
and $\rho_0$ the Friedmann model (i.e., $\omega = \infty$) results given in the third column and in Eq. (16.4.42),
respectively.


For $H_0^{-1} = 10^{10}$ years, $q_0$ between 0.01 and 1.0, and $\omega = 6$, Table 16.1
gives a rate of decrease in $G$ between $4 \times 10^{-13}$ parts per year and $2 \times 10^{-11}$
parts per year. In contrast, the Dirac model predicts a much more rapidly decreasing
gravitational “constant”; for $H_0^{-1} = 10^{10}$ years, the rate of decrease of $G$ is
$3 \times 10^{-10}$ parts per year.
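
These numbers follow from combining (16.4.41) with (16.4.42), which gives a rate of decrease $3 q_0 (H_0 t_0) H_0/(\omega + 2)$. The Python sketch below is an illustrative check; the function name is an assumption, and for $q_0 = 0.01$ it uses the $q_0 \ll 1$ value $H_0 t_0 = 1$ as an approximation.

```python
import math

def g_decrease_rate(q0, h0_t0, omega, hubble_time_years):
    """Estimated -(dG/dt)/G in parts per year.

    Combines (16.4.41) with the Friedmann-model density (16.4.42):
    8 pi G0 rho0 t0 / (2 omega + 4) = 3 q0 (H0 t0) H0 / (omega + 2).
    """
    h0 = 1.0 / hubble_time_years
    return 3.0 * q0 * h0_t0 * h0 / (omega + 2.0)

if __name__ == "__main__":
    hubble_time = 1.0e10  # years
    omega = 6.0
    low = g_decrease_rate(0.01, 1.0, omega, hubble_time)               # q0 << 1
    high = g_decrease_rate(1.0, math.pi / 2 - 1.0, omega, hubble_time)  # q0 = 1
    dirac = 3.0 / hubble_time                                           # Dirac model
    print(f"Brans-Dicke, q0 = 0.01: {low:.2e} per year")
    print(f"Brans-Dicke, q0 = 1.0 : {high:.2e} per year")
    print(f"Dirac                 : {dirac:.2e} per year")

    # k = 0 comparison: estimate with q0 = 1/2, H0 t0 = 2/3 versus H0/(1 + omega)
    estimate = g_decrease_rate(0.5, 2.0 / 3.0, omega, hubble_time)
    exact = 1.0 / ((1.0 + omega) * hubble_time)
    print(f"k = 0 estimate / exact: {estimate / exact:.3f}")
```

For $k = 0$ the ratio of the estimate to the analytic rate is $(1 + \omega)/(2 + \omega)$, which is 7/8 for $\omega = 6$, consistent with the estimate being about 12% low.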

The best experimental upper limit on the present rate of change of $G$ comes
from the analysis of radar observations of Mercury and Venus.$^{31}$ For a planet in a
circular orbit with radius $r$ and velocity $v$, we have $M_\odot G = v^2 r$, so if the orbital
angular momentum $mrv$ stays fixed while $G$ changes, then $r$ and $v$ will vary as

$$r \propto \frac{1}{v} \propto \frac{1}{G} \tag{16.4.43}$$

and the orbital period $2\pi r/v$ will vary as

$$\frac{2\pi r}{v} \propto \frac{1}{G^2} \tag{16.4.44}$$

By repeated comparison of the orbital periods of the inner planets over the period
1966-1969, with time as told by an atomic clock (which does not depend on $G$),
Shapiro et al.$^{31}$ have set an upper limit

$$\left|\frac{\dot G}{G}\right|_0 < 4 \times 10^{-10}/\text{year}$$
This is almost good enough to rule out the Dirac theory, but is not yet sufficiently
stringent to put a useful limit on the Brans-Dicke coupling parameter $\omega$. However,
the error in these measurements of $(\dot G/G)_0$ is expected to decrease approximately
with the 5/2-power of the time span of the observations, so another five years of
observation should reduce the upper limit on $\dot G/G$ to the value expected in a
Brans-Dicke model with $q_0$ of order unity and $\omega = 6$.
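
The scaling of a circular orbit with $G$ at fixed angular momentum, which underlies the planetary test described above, can be illustrated numerically. In the Python sketch below the function names and the sample orbit are assumptions made for illustration; the rescaling rule itself is just the statement that $v^2 r \propto G$ with $rv$ held fixed.

```python
import math

def rescaled_orbit(r, v, g_ratio):
    """Circular orbit after G -> g_ratio * G with r * v held fixed.

    M G = v^2 r together with r v = const gives v proportional to G
    and r proportional to 1/G.
    """
    return r / g_ratio, v * g_ratio

def orbital_period(r, v):
    """Period of a circular orbit of radius r and speed v."""
    return 2.0 * math.pi * r / v

if __name__ == "__main__":
    r0, v0 = 1.0, 1.0   # arbitrary units for the initial orbit
    g_ratio = 0.99      # hypothetical 1% decrease in G
    r1, v1 = rescaled_orbit(r0, v0, g_ratio)
    print("angular momentum ratio:", (r1 * v1) / (r0 * v0))
    print("v^2 r ratio           :", (v1 ** 2 * r1) / (v0 ** 2 * r0))
    print("period ratio          :", orbital_period(r1, v1) / orbital_period(r0, v0))
    print("1 / g_ratio^2         :", 1.0 / g_ratio ** 2)
```

The period ratio equals $1/G^2$ evaluated at the new value of $G$, so a fractional decrease in $G$ produces twice that fractional increase in the orbital periods.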

There is also a prospect of setting an upper limit on the rate of change of G 
from analysis of the flight time of laser signals, sent from the earth to the moon, 
and reflected back to earth by the corner reflectors placed on the moon’s surface 
by the Apollo expeditions. 32 However, the analysis of these observations is 
seriously complicated by tidal effects, which play a major role in the dynamics of 
the earth-moon system. (Fortunately, such tidal effects do not seriously affect the 
planetary motions that were studied by Shapiro et al.) 

Variations in $G$ over the last few millennia can perhaps be determined from the
study of ancient eclipse records.$^{33}$ A total eclipse of the sun occurs only over a
very small portion of the earth’s surface, so the knowledge that a particular total
eclipse was seen at some particular place provides precise information on the
ratio of the length of the day, which does not strongly depend on $G$, to the length
of the year and the lunar month, which vary like $1/G^2$. The analysis by Curott$^{34}$
and Dicke$^{35}$ of five eclipses, which occurred between 1062 B.C. and 71 A.D., gives
an average rate of decrease in the earth’s rotation rate relative to planetary
periods of $(15.9 \pm 0.7) \times 10^{-11}$ parts per year. The earth’s rotation rate is subject
to a number of known influences,$^{36}$ in particular a tidal deacceleration, between
$23.5 \times 10^{-11}$ and $25.6 \times 10^{-11}$ parts per year, and an acceleration owing to the
rise in sea level and the isostatic recovery of the geoid, between $0.5 \times 10^{-11}$ and
$3.0 \times 10^{-11}$ parts per year. This leaves a residual unexplained acceleration of the
earth’s rotation between $4 \times 10^{-11}$ and $10 \times 10^{-11}$ parts per year. Since the
eclipse data measure the rate of the earth’s rotation relative to planetary periods,
this apparent residual acceleration could be explained in terms of a deacceleration
of planetary motions, owing to a decrease of $G$ at a rate between $2 \times 10^{-11}$ and
$5 \times 10^{-11}$ parts per year. [See Eq. (16.4.44).] However, the eclipse data are
somewhat ambiguous. (Was Archilochus on Paros or Thasos during the eclipse of
648 B.C.?) More important, there are many uncertainties in the complicated
dynamics of the earth-moon system that could account for the small residual
apparent acceleration of the earth’s rotation, without appealing to a decrease
in $G$.
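
The bookkeeping behind the residual acceleration can be made explicit with simple interval arithmetic. The Python sketch below is an illustration only; the helper name is an assumption, and all rates are in units of $10^{-11}$ parts per year as quoted in the text.

```python
def interval_subtract(a, b):
    """Interval a - b for (low, high) pairs: widest possible range."""
    return (a[0] - b[1], a[1] - b[0])

if __name__ == "__main__":
    # All rates in units of 1e-11 parts per year.
    observed_decel = (15.9 - 0.7, 15.9 + 0.7)  # net slowing seen in the eclipse data
    tidal_decel = (23.5, 25.6)                 # tidal slowing
    known_accel = (0.5, 3.0)                   # sea level rise and isostatic recovery

    # observed = tidal - known_accel - residual_accel, solved for residual_accel
    expected_decel = interval_subtract(tidal_decel, known_accel)
    residual_accel = interval_subtract(expected_decel, observed_decel)

    # Planetary periods vary as 1/G^2, so the rate of decrease of G
    # is half the apparent residual acceleration.
    g_decrease = (residual_accel[0] / 2.0, residual_accel[1] / 2.0)

    print("residual acceleration: %.1f to %.1f" % residual_accel)
    print("implied decrease of G: %.1f to %.1f" % g_decrease)
```

The result, roughly 4 to 10 for the residual acceleration and 2 to 5 for the decrease of $G$, matches the ranges quoted above.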

It may be possible to measure changes over the past 350 million years in the
number of days in a lunar month or a year, by counting monthly or annual growth
bands and daily growth ridges on fossil corals.$^{37}$ However, this approach has not
yet yielded results that are precise enough to be of use to cosmologists.

A secular decrease in the constant of gravitation over billions of years would
have interesting effects on the evolution of the earth and stars, but unfortunately,
none of these effects would yield unambiguous information about whether $G$
really does decrease. With decreasing $G$, the radius of the earth would increase
roughly as $G^{-0.1}$, causing complicated damage to the earth’s crust.$^{38}$ If $G$ were
larger in the past, then stars would have run through their thermonuclear evolution
more rapidly;$^{39}$ for $G$ decreasing at a rate of $(1\text{-}2) \times 10^{-11}$ parts per year, a
star whose true age is 6 to 8 billion years would appear to us to be 15 to 25 billion
years old.$^{40}$ Finally, if $G$ were greater in the past, then the sun’s luminosity $L_\odot$
would have been greater by a factor$^{41}$ roughly proportional to $G^8$, and the radius
$r_\oplus$ of the earth’s orbit would have been smaller by a factor proportional to $G^{-1}$,
so the surface temperature of the earth, which varies more or less as $(L_\odot/r_\oplus^2)^{1/4}$,
would have been greater by a factor proportional to $G^{2.5}$. If $G$ decreases as $t^{-0.09}$,
as would be expected according to Eq. (16.4.28) for a Brans-Dicke model with
$k = 0$ and $\omega = 6$, and if the age of the universe is $8 \times 10^9$ years, then the
temperature of the earth’s surface $2 \times 10^9$ years ago would have been only about
20°C higher than at present, which need not have had any drastic effect on
biological evolution. On the other hand, if $G$ has decreased as $1/t$ as expected in Dirac’s
cosmology, then the temperature of the earth’s surface $10^9$ years ago would have
been above the boiling point of water, unless the earth’s albedo was very much
higher than at present.$^{42}$ Thus, too large an early value of the constant of
gravitation could have prevented the evolution of life forms capable of curiosity about
the universe.
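
The temperature estimates in the last paragraph can be sketched numerically. In the Python example below, the present mean surface temperature of 288 °K is an assumption that is not given in the text, and the Dirac case reuses the age of $8 \times 10^9$ years purely for illustration (a smaller age would give a still higher past temperature). The function name is also an assumption.

```python
def temperature_ratio(t_past, t_now, g_time_exponent):
    """T(t_past) / T(t_now) for G ~ t^g_time_exponent.

    Luminosity ~ G^8 and orbital radius ~ 1/G give
    T ~ (L / r^2)^(1/4) ~ G^((8 + 2) / 4) = G^2.5.
    """
    temperature_exponent = (8.0 + 2.0) / 4.0
    return (t_past / t_now) ** (g_time_exponent * temperature_exponent)

if __name__ == "__main__":
    T_NOW = 288.0   # assumed present surface temperature in deg K (not from the text)
    AGE = 8.0e9     # years

    # Brans-Dicke, k = 0, omega = 6: G ~ t^(-2/(4 + 3*6)), about t^(-0.09)
    bd_exponent = -2.0 / (4.0 + 3.0 * 6.0)
    t_bd = T_NOW * temperature_ratio(AGE - 2.0e9, AGE, bd_exponent)
    print(f"Brans-Dicke, 2e9 years ago: {t_bd - T_NOW:.1f} deg warmer")

    # Dirac: G ~ 1/t
    t_dirac = T_NOW * temperature_ratio(AGE - 1.0e9, AGE, -1.0)
    print(f"Dirac, 1e9 years ago: {t_dirac:.0f} deg K")
```

Under these assumptions the Brans-Dicke case comes out close to 20 degrees warmer, while the Dirac case exceeds 373 °K.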
APPENDIX

SOME USEFUL NUMBERS* 


Numerical Constants 


$\pi = 3.1415927$          $1'' = 4.8481 \times 10^{-6}$ radians

$e = 2.7182818$            $\ln 10 = 2.3025851$


Physical Constants 


Speed of light                 $c = 2.9979250(10) \times 10^{10}$ cm sec$^{-1}$
Gravitational constant         $G = 6.6732(31) \times 10^{-8}$ dyn cm$^2$ g$^{-2}$
                               $G/c^2 = 7.425 \times 10^{-29}$ cm g$^{-1}$
Planck’s constant              $\hbar = 6.582183(22) \times 10^{-16}$ eV sec
                               $\hbar = 1.0545919(80) \times 10^{-27}$ erg sec
                               $2\pi\hbar = 6.625 \times 10^{-27}$ erg sec
Electron volt                  $1$ eV $= 1.6021917(70) \times 10^{-12}$ erg
Electronic charge (unrat.)     $e = 4.803250(21) \times 10^{-10}$ esu


* Taken from “Review of Particle Properties,” Particle Data Group, Rev. Mod. Phys. 43, No. 2, Part II,
(1971) and Astrophysical Quantities, by C. W. Allen (Athlone Press, London, 1955). Where figures are
given in parentheses, they indicate the one standard deviation uncertainty in the last digits of the main
numbers.
Fine structure constant        $\alpha = e^2/\hbar c = 1/137.03602(21)$
Electron mass                  $m_e = 9.109558(54) \times 10^{-28}$ g
                               $m_e c^2 = 0.5110041(16)$ MeV
Proton mass                    $m_p = 1.67 \times 10^{-24}$ g
                               $m_p c^2 = 938.2592(52)$ MeV
Neutron mass                   $m_n c^2 = 939.5527(52)$ MeV
Rydberg                        $m_e e^4/2\hbar^2 = 13.605826(45)$ eV
Thomson cross-section          $8\pi e^4/3 m_e^2 c^4 = 0.6652453(61) \times 10^{-24}$ cm$^2$

Weak coupling constant 

g^jh 2 = 1.02 x 10 ~ 5 m p ~ 2 

Boltzmann constant             $k = 1.380622(59) \times 10^{-16}$ erg °K$^{-1}$
                               $k^{-1} = 11604.85(49)$ °K/eV
Black-body constant            $a = \pi^2 k^4/15 c^3 \hbar^3 = 7.5641 \times 10^{-15}$ erg cm$^{-3}$ °K$^{-4}$
Typical stellar mass           $(\hbar c/G)^{3/2} m_p^{-2} = 3.77 \times 10^{33}$ g

General Astronomical Constants 

Sidereal year (1900)                     $1$ year $= 3.1558149984 \times 10^7$ sec
Light year                               $1$ light year $= 9.4605 \times 10^{17}$ cm
Mean earth-sun distance                  $1$ a.u. $= 1.495985(5) \times 10^{13}$ cm
Parsec                                   $1$ pc $= 3.0856(1) \times 10^{18}$ cm
                                         $= 3.2615$ light year
Hubble time for a Hubble constant        $[100$ km sec$^{-1}$ Mpc$^{-1}]^{-1} = 9.78 \times 10^9$ years
  of 100 km sec$^{-1}$ Mpc$^{-1}$
Solar mass                               $M_\odot = 1.989(2) \times 10^{33}$ g
                                         $M_\odot G/c^2 = 1.475$ km
Solar radius                             $R_\odot = 6.9598(7) \times 10^5$ km
Dimensionless solar surface potential    $M_\odot G/R_\odot c^2 = 2.12 \times 10^{-6}$
Solar luminosity                         $L_\odot = 3.90(4) \times 10^{33}$ erg sec$^{-1}$
Earth mass                               $M_\oplus = 5.977(4) \times 10^{27}$ g
                                         $M_\oplus G/c^2 = 0.443$ cm
Earth equatorial radius                  $R_\oplus = 6.37817(4) \times 10^3$ km
Dimensionless earth surface potential    $M_\oplus G/R_\oplus c^2 = 6.95 \times 10^{-10}$
Acceleration due to gravity at           $g = 980.665$ cm sec$^{-2}$
  earth’s surface
Velocity of earth satellite in low orbit $v_s = 7.9$ km sec$^{-1}$
Mean orbital velocity of earth           $29.78$ km sec$^{-1}$
Lunar mass                               $M_☾ = 7.35 \times 10^{25}$ g
                                         $M_☾ G/c^2 = 5.45 \times 10^{-3}$ cm
Lunar radius                             $R_☾ = 1738$ km
Dimensionless lunar surface potential    $M_☾ G/R_☾ c^2 = 3.14 \times 10^{-11}$
Mean earth-moon distance                 $r_☾ = 3.84 \times 10^5$ km
Apparent luminosity of star with         $l = 2.52 \times 10^{-5}$ erg cm$^{-2}$ sec$^{-1}$ $\times\, 10^{-2m/5}$
  apparent bolometric magnitude $m$
Absolute luminosity of star with         $L = 3.02 \times 10^{35}$ erg sec$^{-1}$ $\times\, 10^{-2M/5}$
  absolute bolometric magnitude $M$
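
Several entries in these lists are derived from the primary constants, and a short Python sketch shows how they can be recomputed. The constant and function names below are illustrative assumptions; the numerical inputs are the tabulated values, read with the exponents restored. The recomputed numbers should agree with the tabulated ones to about a percent or better, not necessarily to the last printed digit.

```python
C_LIGHT = 2.9979250e10   # speed of light, cm/sec
G_NEWTON = 6.6732e-8     # gravitational constant, dyn cm^2 g^-2
YEAR = 3.1558149984e7    # sidereal year, sec
PARSEC = 3.0856e18       # parsec, cm

def gravitational_radius_cm(mass_g):
    """M G / c^2 in cm for a mass in grams."""
    return mass_g * G_NEWTON / C_LIGHT ** 2

def surface_potential(mass_g, radius_cm):
    """Dimensionless surface potential M G / (R c^2)."""
    return gravitational_radius_cm(mass_g) / radius_cm

def hubble_time_years(hubble_km_sec_mpc):
    """Inverse Hubble constant in years, for H in km/sec/Mpc."""
    hubble_per_sec = hubble_km_sec_mpc * 1.0e5 / (1.0e6 * PARSEC)
    return 1.0 / (hubble_per_sec * YEAR)

if __name__ == "__main__":
    bodies = {
        "sun": (1.989e33, 6.9598e10),    # mass in g, radius in cm
        "earth": (5.977e27, 6.37817e8),
        "moon": (7.35e25, 1.738e8),
    }
    print(f"G/c^2 = {G_NEWTON / C_LIGHT ** 2:.4e} cm/g")
    print(f"Hubble time for H = 100: {hubble_time_years(100.0):.3e} years")
    for name, (mass, radius) in bodies.items():
        print(f"{name}: M G/c^2 = {gravitational_radius_cm(mass):.4e} cm, "
              f"M G/(R c^2) = {surface_potential(mass, radius):.3e}")
```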


Elements of Planetary Orbits 


Planet     Symbol   Period           Semilatus Rectum    Eccentricity
                    T (trop year)    L ($10^6$ km)       e (in 1900)

Icarus              1.12             51.0                0.827
Mercury    ☿        0.24085          55.46               0.205615
Venus      ♀        0.61521          108.20              0.006820
Earth      ⊕        1.00004          149.54              0.016750
Mars       ♂        1.88089          225.95              0.093312
Jupiter    ♃        11.86223         776.5               0.048332
Saturn     ♄        29.45772         1423                0.055890
Uranus     ♅        84.013           2863                0.0471
Neptune    ♆        164.79           4498                0.0085
Pluto      ♇        248.4            5500                0.2494
Selected Galaxies*

Local Group          Type         Distance    $m_{pg}$      cz (obs.)
                                  (Mpc)                     (km/sec)

Galaxy and Nearest Neighbors

Galaxy               Sb or Sc     --          --            --
LMC                  Ir or SBc    0.049       0.86          +280
SMC                  Ir           0.058       2.86          +167
Ursa Minor           dE           0.077       ?             ?
Draco                dE           0.08        ?             ?
Sculptor             dE           0.09        10.5          ?
Fornax               dE           0.13        9.1           +40
Leo I                dE           0.23        11.27         ?
Leo II               dE           0.23        12.85         ?
NGC6822              Ir           0.52        9.21          -40

Other Members of Local Group

NGC224(M31)          Sb           0.65        4.33          -270
NGC205               E6p          0.65        8.89          -240
NGC221(M32)          E2           0.65        9.06          -210
NGC147               dE4          0.65        10.57
NGC185               E0           0.65        10.29         -340
NGC598(M33)          Sc           0.74        6.19          -210
IC1613               Ir           0.74        10.00         -240
Maffei 1             E            ~1          ~5.8(vis)

Miscellaneous Bright Galaxies

NGC3031(M81)         Sb           2.0         7.85          +80
NGC3034(M82)         Scp          2.0         9.20          +400
NGC5236(M83)         Sc           2.4         7.0           +320
NGC4826(M64)         ?            3.7         9.27          +360
NGC5128(Cen A)       E0p(R)       ~4.0        7.87          +260
NGC4736(M94)         Sbp          4.3         8.91          +350
NGC5055(M63)         Sb           4.3         9.26          +2600
NGC5194(M51)         Sb           4.3         9.26          +550
NGC5457(M101)        Sc           4.3         8.20          +400


* Distances, types, and magnitudes are mostly taken from the compilation of S. van den Bergh, Observer’s
Handbook of the Royal Canadian Astronomical Society, 1971. Under “Type,” E denotes “elliptical,” with
E0, E1, . . . increasingly flat; S denotes “spiral,” with S0, Sa, Sb, Sc, . . . increasingly open; SB denotes
“barred spiral,” with SB0, SBa, SBb, SBc, . . . increasingly open; Ir denotes “Irregular”; p denotes
“peculiar”; d denotes “dwarf”; R denotes “strong radio source.” For Maffei 1, see H. Spinrad et al.,
Ap. J., 163, L25 (1971).
Messier Galaxies in the Virgo Cluster (Visual Magnitudes)

NGC4472(M49)         E4           15 ± 5      8.9
NGC4579(M58)         SBb          15 ± 5      9.9
NGC4621(M59)         E5           15 ± 5      10.3
NGC4649(M60)         E2           15 ± 5      9.3
NGC4303(M61)         Sc           15 ± 5      9.7
NGC4374(M84)         E?           15 ± 5      9.8
NGC4382(M85)         S0           15 ± 5      9.5
NGC4486(M87)         E0p(R)       15 ± 5      9.3           +1220
NGC4501(M88)         Sb           15 ± 5      9.7
NGC4552(M89)         E0           15 ± 5      10.3
NGC4569(M90)         Sb           15 ± 5      9.7
NGC4192(M98)         Sb           15 ± 5      10.4
NGC4254(M99)         Sc           15 ± 5      9.9
NGC4321(M100)        Sc           15 ± 5      9.6
NGC4594(M104)        Sb           15 ± 5      8.1           +1020


Selected Clusters of Galaxies


Selected Clusters of Galaxies 


Cluster           Est. No. Galaxies    cz (km/sec)

Virgo             2500                 1150
Pegasus I         100                  3800
Pisces            100                  5000
Cancer            150                  4800
Perseus           500                  5400
Coma              1000                 6700
Hercules                               10300
Pegasus II                             12800
Cluster I         400                  15800
Ursa Major I      300                  15400
Leo               300                  19500
Gemini            200                  23300
Cor. Bor          400                  21600
Bootes            150                  39400
Ursa Major II     200                  41000
Hydra                                  60600